Can a local variable's memory be accessed outside its scope? (stackoverflow.com)
89 points by danso on Dec 2, 2013 | 39 comments



The original question asked about C++, but in the old days of assembly language, accessing "out of scope" data from the stack was fairly useful.

For example, Apple II computers had a series of slots into which expansion cards could be inserted to add new features to the computer. These cards, as you might expect, often needed controller software, and this was supplied via on-card ROM. The code in this ROM, by convention, was mapped into memory at address $Cx00, where x was the slot number the user happened to choose for the card. As a result, when this code was called, it had no idea where it resided in memory and, consequently, which slot its card was in. To figure these things out, the controller code used the following trick:

    JSR IORTS      ; push the return address, jump straight to an RTS
    TSX            ; copy the stack pointer into X
    LDA $100,X     ; read just above the top of stack: the high byte
                   ; ($Cx) of the return address the JSR pushed
The JSR instruction makes a subroutine call, which causes the address of the next instruction to be pushed onto the system stack and then transfers control to the subroutine, in this case a well-known subroutine in system ROM called IORTS. This routine, as its name implies, is just an RTS instruction, so it returns immediately back to its caller. Control thus returns to the second instruction of our code, which copies the stack pointer into the X register. The third instruction then uses this pointer to read above the top of the system stack to obtain the value that was there during the call to IORTS. This value, of course, was the most-significant byte of the return address pushed onto the stack by the JSR instruction. Now the code knows where it resides.

If you want to see a prime example of this technique in some famous code hand-assembled by Steve Wozniak 35 years ago, take a look at page 22 of the following PDF file. It contains the boot code for the Apple Disk II controller card:

https://s3.amazonaws.com/s3data.computerhistory.org/atchm/do...


Note that this was necessary because on some processors you cannot directly read the program counter. As this Stack Overflow answer explains...

http://stackoverflow.com/a/7932364/419237

...the canonical way to get it on x86 is to call a subroutine right in front of you and pop the return address off the stack, instead of having "ret" pop the return address and, well... return.
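
In other words, something like this (a minimal sketch in 32-bit AT&T syntax; the local label is illustrative):

        call 1f        # push the address of the next instruction
    1:  pop %eax       # %eax now holds that address, i.e. (nearly) the PC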

But actually you can translate the 6502 solution directly to x86 assembler (32-bit gcc on Linux):

    #include <stdio.h>

    int main(int argc, char **argv) {
        void *p;

        (void) argc;
        (void) argv;

        printf("main is at %p\n", &main);

        __asm__(
            "   jmp 1f\n"             /* that's the IORTS */
            "0: ret\n"                /* subroutine */
            "1: call 0b\n"            /* call IORTS */
            "   sub $0x4,%%esp\n"     /* back up over the stale return address */
            "   pop %0\n" : "=r"(p)); /* pop it into p */

        printf("PC = %p\n", p);
        return 0;
    }
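
If you build that as 32-bit code (e.g. gcc -m32), the two printed addresses should come out close together: p receives the address of the instruction right after the call, so "PC" lands a little way past main's entry point.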


From what I recall, that's not necessarily safe on modern OSes and processors due to interrupts, but there's still buggy code out there that relies on it.


I can't think of a modern OS/ISA combination where that's the case. State will get saved on a kernel-mode stack (if not somewhere else entirely). This is because the stack pointer is controlled by user mode, and you don't want user code to be able to set the stack pointer to some area in kernel space and then invoke a software interrupt in order to overwrite privileged code or data.


Maybe in modern OS/CPU combos, but on a 70s-era 8-bit CPU like the 6502, the code described above is definitely vulnerable to a hardware interrupt overwriting the out-of-scope memory before it is read. One way to make it safe would be to disable interrupts, although non-maskable interrupts would still be a problem.

However, although I haven't checked the details, I suspect that something much cleverer does make the trick robust. What would the hardware interrupt service routine overwrite the out-of-scope memory with? For many CPUs, and maybe for the 6502, it would be the return address for the interrupt, which is slightly different from the originally pushed return address but lands in the same page, and the page is all the trick needs. So a little bit of Woz magic, perhaps.


Every programmer who uses a language with C's memory model (and preferably absolutely every programmer) should know the following by heart:

    |-------| - max address
    |STACK ↓|
    |  SP   | ← stack pointer
    | ..... |
    | ..... |
    |HEAP  ↑|
    |-------|
    |other  |
    |mapped |
    |memory |
    |-------| - zero
Barring alternate memory managers (if you use one, you know the diagram and are now writing a post about why it's wrong), the stack grows down and the heap grows up. When you call a function, the address of the next instruction in the current function is pushed onto the stack, SP is decremented by the total size of the called function's stack variables, and each of those variables lives at SP + x, where x is an offset calculated by the compiler. When the function returns, the memory isn't cleared: the address of where we left off in the caller is read, SP is incremented back to its previous value, and the processor resumes from that point. The "push" and "pop" CPU instructions don't allocate memory; they're just shorthand for adjusting SP and copying a value.

For a fun demonstration, compile this C code with -O0 (optimizations off, or debug build in Visual Studio, IIRC):

    #include <stdio.h>

    int foo(int unused) {
        int a;        /* deliberately left uninitialized */
        return a;
    }

    int bar(int x) {
        int b;
        b = x;
        return b;
    }

    int main(int argc, char** argv) {
        printf("%d\n", foo(0));
        bar(10);
        printf("%d\n", foo(0));
        return 0;
    }
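
(In a typical unoptimised build the first call prints leftover garbage and the second prints 10: bar()'s b occupies the same uncleared stack slot that foo()'s a lands in.)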


Not always. Stacks can grow up in some CPUs. http://stackoverflow.com/questions/664744/what-is-the-direct...

And then there's Itanium, which has two separate stacks, one growing up and one growing down...


Kinda, but not exactly. It's the same kind of issue that comes up when learning private/protected/public (or the equivalents). They do not actually prevent outside code from accessing the bits you specify, as you might expect if you think of them as something like file permission bits. Rather, they are hints for the type system, and are usually invisible at runtime.

In the same way, scopes do not actively prevent outside code from accessing the variables within. Rather, they are hints telling the compiler that these variables are associated with these bits of code, and at runtime things are usually just laid out in the most efficient way rather than with scope boundaries enforced.
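
A minimal sketch of the point (a hypothetical example; the final read is undefined behaviour, but nothing at runtime stops it):

    #include <cstdio>

    int main() {
        int* p;
        {
            int inner = 42;   // 'inner' goes out of scope at the closing brace
            p = &inner;
        }
        // The *name* inner is gone: the compiler would reject it here.
        // The memory is not: this read is UB, yet it typically still prints 42.
        std::printf("%d\n", *p);
        return 0;
    }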


Bonus question: can you get the example to print anything other than 5? The fact that the example prints 5 is an artifact of having allocated "a" on the stack. However, the compiler does not necessarily have to do that. If the compiler can determine that the pointee of "p" is undefined, you should be able to get it to print the value of whatever was previously in the register allocated for the pointee of "p" (which will probably be RSI on an x86-64 machine since IIRC the "cout" statement gets turned into a two-argument function call, the second of which is passed in RSI).

I'm not sure if any compiler will actually do this, though. They might just see that the address of "a" is taken at some point and refuse to allocate it to a register.
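
For reference, the example from the question is essentially this (reconstructed from the discussion here, so treat the details as approximate):

    #include <iostream>

    int* foo() {
        int a = 5;
        return &a;     // returns the address of a local that is about to die: UB
    }

    int main() {
        int* p = foo();
        std::cout << *p << "\n";   // "prints 5" in an unoptimised build
        *p = 8;                    // scribbles on the reclaimed stack slot
        std::cout << *p << "\n";   // and this then prints 8
        return 0;
    }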


If you enable optimisation, you probably would see something other than 5.


Building on the foo() from the question:

    int* bar(void) { int a = 7; return &a; }

    int* p = foo();
    int* q = bar();
    printf("%d\n", *p);  // 7 (at least for most ABIs that I have seen in the wild)


Do people really find extended analogies like this helpful? If you have even a vague understanding of how computers work, it just seems confusing to me.


The pleasure for me in reading this extended analogy was making the parallels between the analogy and the computer equivalent, as I read.

For example, "someone might have replaced the nightstand by an armoire" (e.g., what used to be a Bar object is now a Foo), and "someone might be tearing up the book just as you walked in" (e.g., an asynchronous process may be destroying the object piece by piece, concurrently with your own execution).


My short explanation: every time you call a function, its vars get allocated on the stack. When you return from the function, you "pop" the stack, but all that means is that the stack pointer now points back into the caller's frame. There is no reason to actually clear the stack memory, as that would be a waste of CPU cycles. Therefore, if main() calls foo() and foo() returns, the contents of foo()'s variables are still sitting just past the top of the stack, and given the right memory address you can still access them.

As others point out, this is not something you should rely on. On the other hand, if you are trying to overflow the stack or somehow break the program, this is definitely something to try.

The converse of this is that local variables of main() can be made accessible to foo():

    #include <stdio.h>

    void foo(int *x) {
        printf("a = %d\n", *x);
    }

    int main(void) {
        int a = 12;
        foo(&a);
        return 0;
    }
This of course makes sense: when you are in the middle of foo(), main()'s variables have to be stored somewhere and are accessible.

Edit: here's another fun way to get at main()'s a from inside foo():

    #include <stdio.h>

    void foo() {
        int *x;
        x = (int *) (&x + sizeof(int));
        printf("a = %d\n", *x);
    }

    int main() {
        int a = 12;
        foo();
        return 0;
    }


Therefore, if main() calls foo() and foo() returns, the contents of foo()'s variables are still sitting just past the top of the stack, and given the right memory address you can still access them.

Unless an interrupt occurs between returning from foo() and reading those leftover stack variables. foo()'s abandoned stack space gets smashed by at least the return address pushed for the interrupt handler, plus anything the handler itself pushes.

This falls into the class of use-after-free bugs. Like most instances of such, the technique works until something makes it stop working.


Great point! Yes, that's exactly what this is: use-after-free. That's why I am saying that this is useful when you are trying to somehow break the program, but not useful when you are doing constructive things. Presumably, when you are trying to break the program, you can run it multiple times, and chances are that at some point the interrupt will not happen.


"every time you call a function, its vars get allocated on the stack."

That's 'may get allocated'.

Also, even if you compile with -OnegativeInfinity, your 'another fun way' won't do what you think it does on architectures that store return addresses and locals on the same stack (= most CPUs that mortals will program) or that grow the stack 'up' (rare)
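
(Two things go wrong with it: pointer arithmetic scales by the size of the pointed-to type, here int *, so &x + sizeof(int) advances by sizeof(int) pointer-widths rather than sizeof(int) bytes; and even with the right byte offset, foo()'s saved return address, and possibly a saved frame pointer, sit between x and main()'s a.)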


Interesting. I can't find any reference to negativeInfinity and what that would do. Can you explain further?

Also, like I said, that's a "fun" way, as in a giant hack not meant to ever be written outside of just messing around.


That -OnegativeInfinity is a pun. With many compiler chains, -O0 (hyphen-oh-zero) does 'no' optimization, -O1 does a bit more, -O2 some more, -O3 even more. That 'no' is in quotes because there is no such thing as 'no optimization'. For example, given enough locals, a compiler must do register allocation.

The hypothetical -OnegativeInfinity would truly do no optimization.


No. The code in question is invoking undefined behaviour, making the question pointless.


It is often instructive and useful to look at the behavior of real implementations, not just the idealized behavior given by the standard.


Indeed - sometimes you run into something and think "that shouldn't have worked", and it's useful to understand why.


When compiled with -O3 using g++, it prints 0 8. The compiler will exploit the undefined behavior when optimizing.


There's a fun example with clang where you can have two pointers where:

    x == y
But:

    *x != *y
It involves invoking undefined behavior by using a pointer after it's been freed and arranging for a new pointer at the exact same location. Clang caches the contents of the old pointer and uses them in the comparison.
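
One way to arrange that (a sketch using realloc, not necessarily the exact program being remembered; clang at -O2 has been observed to print "1 2" for code of this shape):

    #include <cstdio>
    #include <cstdlib>

    int main() {
        int* p = static_cast<int*>(std::malloc(sizeof(int)));
        int* q = static_cast<int*>(std::realloc(p, sizeof(int)));
        // If realloc keeps the block in place, q holds the same address as p,
        // but p is dangling: every use of it below is undefined behaviour.
        *p = 1;
        *q = 2;
        if (p == q)                          // the pointers compare equal...
            std::printf("%d %d\n", *p, *q);  // ...yet this can print "1 2"
        std::free(q);
        return 0;
    }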


Then the answer is "yes and no".


"yes and no" is a poor way of putting anything. I prefer a nice clear statement like:

You can access a local variable from outside its scope if and only if your code is not actually valid C++.


Valid C++ may have undefined behavior.


I think there is a difference between syntactically valid and semantically valid.

UB always puts you in the semantically invalid category.


The compiler has the freedom to assume that the code will never hit undefined behavior. To actually access the variable, you have to hit undefined behavior.


Actually the compiler (well, the author thereof) must know that the code may very well hit undefined behavior, and address how such known situations should be handled.

From the C++ spec (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n190...):

1.3.13 undefined behavior [defns.undefined]

behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements. Undefined behavior may also be expected when this International Standard omits the description of any explicit definition of behavior. [ Note: permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. —end note ]

1.3.14 unspecified behavior [defns.unspecified]

behavior, for a well-formed program construct and correct data, that depends on the implementation. The implementation is not required to document which behavior occurs.

The spec goes on to explicitly state, over 150 times, that certain situations result in undefined behavior; my favorite being:

Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic variable, might cause it to behave as if it is neither true nor false.
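
In other words, something like this (a sketch; whether you can actually observe it depends on the compiler, the flags, and whatever is left in the stack slot):

    #include <cstdio>

    int main() {
        bool b;                 // uninitialized automatic variable
        if (b == true)  std::printf("seems true\n");
        if (b == false) std::printf("seems false\n");
        // If the byte under b happens to hold, say, 2, both comparisons
        // promote to int and fail (2 != 1 and 2 != 0): neither line prints.
        return 0;
    }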


...and by extension, many buffer-overflow security exploits do not actually exist.


One thing it depends on, which I don't think anyone has mentioned, is the compiler's register allocation logic and how that interacts with the rest of the code in the calling function.

If, at the point you call the function, all registers are used, the following statement to retrieve the value may cause a register to be spilled to the stack. That may overwrite the data. That's more likely to happen on a register-poor architecture like x86.


Short answer: maybe, but you shouldn't depend on doing that. Most compilers will even warn you about doing this exact thing, depending on their settings.


The stack is a cheap/fast memory allocator. Upon return (out of scope for local variables) it's freed. Don't point to freed memory.


My instincts are that it's in stack memory, not heap, so there won't necessarily be an access violation, and as you're not overwriting it with anything (or making a new stack frame), sure, the memory's still there and unaltered.

You can write all over your stack if you like. It's a pretty bad idea though.


Yeah, it looks to me like it should work up until you make another function call. Of course, that's assuming the platonic ideal of a memory model, which may not actually hold after the compiler, linker, loader, and memory paging have all had their way with it.


Yes, if you're an idiot. No, if you're not.

EDIT: okay, I'm getting downvoted. But the truth of the matter is, it's highly dependent on whether the developer understands what they're doing. And in both cases, whether the developer knows or doesn't know, accessing variables off the stack this way, with a flying pointer, is pretty idiotic. It's not going to result in great software, people...


You're getting downvoted because:

1) It's not a great insight in the first place. It's common sense and well known that you shouldn't access data this way.

2) It doesn't answer the question, which is about the mechanics (how does it happen?), not the quality of the programmer (what kind of programmer does that?)

3) It's needlessly insulting.

4) It's plainly wrong. It might be "idiotic" to access memory this way, but that doesn't mean that only idiots do it, either with intent (e.g. to hack into something, or to speed up some code with pointer arithmetic) or without (e.g. by accident). It can happen (and it has happened) even to Dennis Ritchie.


Not Kernighan though. He'd never make a mistake like that.



