Buffer overflow somehow executing code?

Saran

OK, every so often I run across an article in a forum somewhere claiming
that, given a "buffer overflow", a hacker can execute code on the system.

This just seems like a load of bunk to me. I've been programming in
various languages, including, though not limited to, C and C++, and I
have never once encountered a situation where writing past the bounds
of a buffer, which is just an array of characters, suddenly gets
converted into some sort of "magical code" that can wreak havoc.

In any programming I've done where you can write outside of the bounds
of the buffer (char array), you get UNDEFINED behavior, not some magical
power. Even the C and C++ specs state this.

Can someone please explain to me where this comes from? One example I
just read was an IE6 exploit where using a URL that's too long and
contains "unusual" characters can allow a "hacker to run code on the
system." Again, this looks like total bunk to me, as a URL is just text,
and writing past the bounds of the buffer just isn't going to give some
REMOTE hacker the ability to suddenly jump into your system, or somehow
put code in there.

Can anyone please clear this up? If I'm missing something here, please
let me know.
 
Don Taylor

Saran said:
OK, every so often I run across an article in a forum somewhere claiming
that, given a "buffer overflow", a hacker can execute code on the system.

This just seems like a load of bunk to me. I've been programming in
various languages, including, though not limited to, C and C++, and I
have never once encountered a situation where writing past the bounds
of a buffer, which is just an array of characters, suddenly gets
converted into some sort of "magical code" that can wreak havoc.

OK. (None of this is secret or being kept from those who want it.)

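On a typical single-stack processor architecture, the return address for
the current procedure or function is stored on the stack. Below that, at
lower addresses, are all the local variables for that procedure or
function, including your input buffer. A copy fills the buffer toward
higher addresses, so input that is too long runs off the end of the
buffer and on toward the return address. If you study the layout of that
small section of memory you can see exactly where your input buffer lies
in relation to the return address that you want to overwrite.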

That gives you the first piece of the puzzle: exactly how much input
there needs to be and exactly which bytes of it will overwrite the
return address sitting there. When the function returns, the processor
jumps to whatever address is now in that slot.
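
To make the layout concrete, here is a toy sketch in C. It is only a
model: a plain byte array stands in for one stack frame, and the sizes
and names (ToyFrame, BUF_SIZE, frame_unchecked_copy and so on) are
assumptions made up for illustration, not the layout of any real
compiler or CPU. Nothing here touches a real return address, so the
program itself stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: a byte array stands in for one stack frame.
 * Sizes and names are assumptions for illustration. */
enum {
    BUF_SIZE   = 8,                    /* the local input buffer          */
    ADDR_SIZE  = sizeof(uint32_t),     /* a pretend 32-bit return address */
    FRAME_SIZE = BUF_SIZE + ADDR_SIZE  /* buffer first, address after it  */
};

typedef struct {
    unsigned char bytes[FRAME_SIZE];
} ToyFrame;

/* Set up the frame: empty buffer, saved return address just past it. */
void frame_init(ToyFrame *f, uint32_t return_address)
{
    assert(f != NULL);
    memset(f->bytes, 0, FRAME_SIZE);
    memcpy(f->bytes + BUF_SIZE, &return_address, ADDR_SIZE);
}

/* Read back whatever now sits in the return address slot. */
uint32_t frame_return_address(const ToyFrame *f)
{
    uint32_t addr;
    assert(f != NULL);
    memcpy(&addr, f->bytes + BUF_SIZE, ADDR_SIZE);
    return addr;
}

/* Copy input into the buffer without checking it against BUF_SIZE,
 * the way an unchecked strcpy-style copy would. The copy is capped at
 * FRAME_SIZE only so that this simulation itself stays well defined. */
void frame_unchecked_copy(ToyFrame *f, const unsigned char *input, size_t len)
{
    assert(f != NULL && input != NULL);
    if (len > FRAME_SIZE) {
        len = FRAME_SIZE;
    }
    memcpy(f->bytes, input, len);
}

/* Build overlong input: BUF_SIZE filler bytes, then the chosen address.
 * out must have room for FRAME_SIZE bytes; returns the bytes written. */
size_t build_overflow_input(unsigned char *out, uint32_t chosen_address)
{
    assert(out != NULL);
    memset(out, 'A', BUF_SIZE);
    memcpy(out + BUF_SIZE, &chosen_address, ADDR_SIZE);
    return FRAME_SIZE;
}
```

In this model, copying five bytes leaves frame_return_address()
unchanged, while copying the output of build_overflow_input() makes it
return the chosen address. The first BUF_SIZE bytes are just filler; the
next ADDR_SIZE bytes are the ones that land on the return address.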

The next piece of the puzzle is what conditions, if any, are put on
the input. Can you supply perfectly reasonable text, perhaps embed your
own null character, and follow that with a string of binary that writes
over the local variables and fills the return address with any address
you care to choose? Or is the input limited or translated in some way,
so that you have to get tricky about which "addresses" you can enter,
and if so, can you still find a way to reach code that you want to run?
That code can be something already in the program, or the bytes of your
own input sitting in the buffer, if the system allows them to be
executed.
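
As a rough sketch of what "conditions on the input" means for the
address bytes, the C below checks whether a given address could pass
through an input path unchanged. The three rules (any byte, no zero
byte, printable ASCII only) and all the names are hypothetical examples
chosen for illustration; a real program may filter or translate its
input in other ways.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical input rules, for illustration only. */
typedef enum {
    RULE_ANY_BYTE,        /* raw binary passes through unchanged   */
    RULE_NO_NUL,          /* a zero byte would end a C string copy */
    RULE_PRINTABLE_ASCII  /* only bytes 0x20..0x7E are accepted    */
} InputRule;

static bool byte_allowed(unsigned char b, InputRule rule)
{
    switch (rule) {
    case RULE_ANY_BYTE:        return true;
    case RULE_NO_NUL:          return b != 0x00;
    case RULE_PRINTABLE_ASCII: return b >= 0x20 && b <= 0x7E;
    }
    return false;
}

/* Index of the first byte the rule rejects, or len if every byte passes. */
size_t first_rejected_byte(const unsigned char *input, size_t len, InputRule rule)
{
    assert(input != NULL);
    for (size_t i = 0; i < len; i++) {
        if (!byte_allowed(input[i], rule)) {
            return i;
        }
    }
    return len;
}

/* True if every byte of the address, as stored in memory, passes the rule. */
bool address_survives_input(uint32_t address, InputRule rule)
{
    unsigned char raw[sizeof address];
    memcpy(raw, &address, sizeof raw);
    return first_rejected_byte(raw, sizeof raw, rule) == sizeof raw;
}
```

For example, an address such as 0x00401000 contains zero bytes, so it
cannot be delivered through a path that stops at the first null
character, while 0x41424344 is made entirely of printable characters and
would pass even the strictest of these three rules. That is the sense in
which the input rules narrow down which addresses you can enter.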

Saran said:
In any programming I've done where you can write outside of the bounds
of the buffer (char array), you get UNDEFINED behavior, not some magical
power. Even the C and C++ specs state this.

It is undefined BECAUSE the language standard writers could not know
what it was you were going to write over the return address, or over
other local variables, not because you cannot carefully study this
and perhaps subvert it for your own purpose. On a given compiler and
machine, what actually happens is perfectly definite.

Saran said:
Can someone please explain to me where this comes from? One example I
just read was an IE6 exploit where using a URL that's too long and
contains "unusual" characters can allow a "hacker to run code on the
system." Again, this looks like total bunk to me, as a URL is just text,
and writing past the bounds of the buffer just isn't going to give some
REMOTE hacker the ability to suddenly jump into your system, or somehow
put code in there.

Can anyone please clear this up? If I'm missing something here, please
let me know.

I hope that I've done something to explain this.

It is amazing what someone with way too much time on his hands can do.
One of the cutest I ever saw was more than 20 years ago. It ONLY ran on
a Digital Equipment VAX. It ONLY worked on VT-100 compatible terminals.
I believe it was the first Obfuscated C winner, declared by default.
It was the tiniest bit of what looked like scrambled nothing, but when
it ran it grabbed your terminal and started a little ASCII rendition of
a train chugging across the screen. Staring at that code gave not a clue
how the author had found a way to hijack the C compiler to produce such
a result.
 
Saran

Thanks for your explanation. All that makes sense to me (I guess I just
didn't think deep enough). :)
 
