Different stacks for return addresses and data?



I wondered whether it would be possible to use different stacks for
return addresses and data to eliminate the harm of buffer overflows
except in unlikely circumstances (e.g. if you have function pointers
saved in data structures).

I don't think you would even need to make x86-incompatible
processors; you would just have to specify alternatives to push and
pop that write to a dedicated data stack...

Does anything speak against this?



Re: Different stacks for return addresses and data?

Joris Dolderer wrote:

All sorts of things are possible. You could implement multiple stacks.
You could dispense with contiguous stacks altogether and store frames
in noncontiguous activation records; and you could allocate two
records for each frame, one for return address and one for data, in
separate areas. You could write frames to disk rather than keeping
them in memory at all, or dump 'em in an online database, or store
them in mercury delay tubes, or bounce them off the moon with a laser.
You could print out return addresses and make a user type them back in
as needed.

There's a bit of a gap between "possible" and "useful".


It's not the processor (though you'd still have the conventional stack
used for things like interrupt servicing). It's the ABI. When every
piece of code running on the system uses a particular calling
mechanism, making radical changes to it will be a bit of work.

More practical - though of dubious utility - would be something like
this: create a language that uses the normal stack, but puts only two
words in each stack frame, the return address and a pointer to an
activation record for parameters and other bindings ("stack"-allocated
variables and the like).

This is easier if you fix the size of the record for a particular
function at compilation time, as C did prior to C99. That way you
don't have to worry about extending the activation record or chaining
additional records.

That would make it harder to overwrite the return address by
corrupting data in the activation frame.

But this would only apply to calls to functions in your own language.
You'd have to provide the normal ABI for calling functions not written
in your language. More importantly, there would still be
stack-smashing vulnerabilities in those functions, so all you've done
is reduce the attack surface a bit.

Even in functions with this sort of separation, there can be overflow
vulnerabilities that let an attacker vector to arbitrary code.
Consider function-pointer parameters (or equivalent), for example:
overflow from some other data in the activation record and overwrite
the function pointer before it's called.
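
To make that last point concrete, here is a toy model in Python, not real
machine code. The record layout, the function table, and every name in it
are made up for illustration: an 8-byte buffer sits directly below a
one-byte "function pointer" (an index into a table), and return addresses
live on a separate stack.

```python
# Toy model: an activation record is a flat byte array, as raw memory would be.
# Hypothetical layout: bytes 0..7 are a character buffer, byte 8 is an index
# into a table of callables, standing in for a function pointer.
BUF_SIZE = 8
FP_SLOT = BUF_SIZE


def greet():
    return "greet"


def attacker_target():
    return "attacker_target"


FUNCTIONS = [greet, attacker_target]  # the index plays the role of an address

return_stack = []  # kept apart from every activation record


def unchecked_copy(record, data):
    # Like strcpy: copies every byte of data without checking BUF_SIZE.
    for i, byte in enumerate(data):
        record[i] = byte


def call_with_input(data):
    return_stack.append("caller+1")   # return address goes to its own stack
    record = bytearray(BUF_SIZE + 1)  # buffer plus function-pointer slot
    record[FP_SLOT] = 0               # "pointer" to greet
    unchecked_copy(record, data)
    result = FUNCTIONS[record[FP_SLOT]]()  # indirect call through the slot
    return result, return_stack.pop()


print(call_with_input(b"hi"))             # ('greet', 'caller+1')
print(call_with_input(b"AAAAAAAA\x01"))   # ('attacker_target', 'caller+1')
```

The return address survives the overflow in both calls, but the nine-byte
input still redirects the indirect call.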

I think this was discussed at more length some years back on VULN-DEV
or a similar list.


Performance, mostly. Many processors provide a contiguous stack with
push/pop opcodes, or similar, because it's very quick. Creating
activation records is relatively expensive.

The performance advantage of a contiguous stack is great enough that
it beat out noncontiguous designs even though it enforces a strict
LIFO ordering on control transfer. Discrete activation records make it
much easier to implement things like coroutines. As it is, thanks to
contiguous stacks we now have the disaster that is widespread use of
preemptive multithreading - a mess created mostly to permit multiple
stacks and get around the limitations of contiguous stacks.

And now, of course, compatibility.

As a side note, it's even more efficient to avoid any main-memory
stack and just use registers, where possible. However, attempts to
leverage lots of registers for subroutine calls - notably register
windows - have run into other issues.

Another aside: another approach to fixing stack-smashing and similar
problems is to implement a capability architecture, which enforces
object access at the CPU level. The most commercially successful
capability architecture is IBM's AS/400 / System i family. Intel
brought out a capability CPU, the 432, in the early '80s but it was a
commercial failure.

Michael Wojcik
Micro Focus
Rhetoric & Writing, Michigan State University

Re: Different stacks for return addresses and data?


"possible", yes.  there are practical difficulties.


At the hardware level, it's simply a matter of redefining the function
CALL and RETURN codes to use a 2nd stack-pointer register.  Of course
to do that, you also have to have that 2nd stack-pointer register.
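
As a rough sketch of what that redefinition amounts to, here is a toy
machine in Python. The class and its method names are invented for
illustration and do not correspond to any real instruction set; the two
Python lists stand in for the memory areas addressed by the two
stack-pointer registers.

```python
class ToyCPU:
    """Hypothetical machine where CALL/RETURN use a second stack."""

    def __init__(self):
        self.data_stack = []    # reached through the ordinary stack pointer
        self.return_stack = []  # reached through the assumed 2nd stack pointer
        self.pc = 0

    def push(self, value):
        self.data_stack.append(value)

    def pop(self):
        return self.data_stack.pop()

    def call(self, target):
        # Save the address of the next instruction on the return stack only.
        self.return_stack.append(self.pc + 1)
        self.pc = target

    def ret(self):
        self.pc = self.return_stack.pop()


cpu = ToyCPU()
cpu.pc = 10
cpu.call(100)   # callee starts at 100, return address 11 is saved
cpu.push(7)     # callee pushes data without popping it again
cpu.push(8)
cpu.ret()
print(cpu.pc, cpu.data_stack)  # 11 [7, 8]
```

On a single-stack machine the unbalanced pushes would make RETURN pick up
data as the return address; here they cannot reach it.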

On the _software_ side things get 'complicated'.  How much space do you
allocate for each of the two stacks, and where do you put it?  And
what do you do about the 'heap'?  _TWO_ variable-size things are easy
to handle -- start one at each end of the 'free space', and let them grow
towards each other.  *THREE* such things are an entirely different kettle
of fish.
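
The two-ended arrangement can be sketched like this (Python, with made-up
names; a list of cells stands in for the free space). Neither side needs a
size limit chosen in advance, since either one fails only when the two
meet. A third growing region has no free end left to start from, which is
the difficulty.

```python
class TwoEndedRegion:
    """One fixed region shared by two areas growing toward each other."""

    def __init__(self, size):
        self.cells = [None] * size
        self.low_top = 0      # next free cell for the area growing upward
        self.high_top = size  # lowest used cell of the area growing downward

    def push_low(self, value):
        if self.low_top == self.high_top:
            raise MemoryError("region exhausted")
        self.cells[self.low_top] = value
        self.low_top += 1

    def push_high(self, value):
        if self.low_top == self.high_top:
            raise MemoryError("region exhausted")
        self.high_top -= 1
        self.cells[self.high_top] = value


region = TwoEndedRegion(4)
for item in ("a", "b", "c"):
    region.push_low(item)      # one side may take most of the space
region.push_high("z")
print(region.cells)            # ['a', 'b', 'c', 'z']

try:
    region.push_high("y")      # the two areas have met
except MemoryError as exc:
    print(exc)                 # region exhausted
```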

Lastly, the [return-address stack-pointer] _has_ to be accessible by
general CPU operations -- to wit, so one can *set* the base of the stack
initially.  That means that a malicious program can retrieve the
return-address stack-pointer, and 'smash' the return address by writing
to the address pointed-to by that stack pointer.

Any more questions?   <grin>

Re: Different stacks for return addresses and data?

In article
 bonomi@host122.r-bonomi.com (Robert Bonomi) wrote:


This was solved decades ago: use segmented memory instead of a linear
address space.  One segment for the IC stack, one for the data stack,
and other segments for the heap.


You can't protect against a program trying to break itself.  But the
point is to protect against malicious DATA.

Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***

Re: Different stacks for return addresses and data?


With a large enough linear address space this can be easily simulated:
Just put the start of the return stack and the data stack one "segment
size" apart. On Linux and HP-UX, the stack is by default limited to 8MB
(other Unixes were similar, last time I looked).  This is tiny even
compared to a 32 bit address space, so reserving 2 areas of 8MB instead
of one wouldn't significantly hamper the growth of the heap. Now think
of 64bit machines ...
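
A quick back-of-the-envelope check of those numbers, assuming 8 MiB per
stack and, as a simplification, that the whole 4 GiB of a 32-bit address
space is available to the process (in practice the kernel usually keeps
part of it):

```python
MIB = 1024 * 1024

stack_limit = 8 * MIB          # default limit quoted above
reserved = 2 * stack_limit     # one area per stack
address_space_32 = 2 ** 32     # simplifying assumption: all 4 GiB usable

share = address_space_32 // reserved
print(f"two stacks take 1/{share} of a 32-bit address space")  # 1/256
```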

The biggest problem is IMHO that a change like this completely breaks
the ABI: You have to recompile everything, so basically you can only do
it on a new platform.

