is there an example of machine code that doesn't make use of a linear contiguous call stack?
what would the alternative be? compute the size of all the stack frames a priori in the compiler, spray them all over main memory, and then maintain a linear contiguous list of addresses? doesn't the linear contiguous nature of function call stacks in machine code preserve locality in order to make more efficient use of caches? or would the caches have to become smarter in order to know to preserve "nearby" stack frames when possible?
also, why not just make the addresses wider and put the pid in the high bits? they're already doing this masking stuff for the security descriptors, why not just throw the pid in there as well and be done with it?
> is there an example of machine code that doesn't make use of a linear contiguous call stack?
Early CPUs didn't have support for a stack, and some early languages such as COBOL and Fortran didn't need one. They didn't allow recursive function calls, so return addresses could be stored at fixed addresses, and a return could be either an indirect jump reading from that address, or a direct jump whose target got rewritten when the return address was stored (see https://people.cs.clemson.edu/~mark/subroutines.html for the history of subroutine calls).
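To make the fixed-return-slot scheme concrete, here's a toy model in C (hypothetical names; the real machines did this with indirect or self-modified jumps, not function pointers):

    #include <stdio.h>

    /* Toy model of pre-stack subroutine linkage: one fixed return slot
     * per subroutine instead of a stack of return addresses.  A
     * recursive call to sub() would overwrite sub_return, which is
     * exactly why those languages forbade recursion. */

    static void (*sub_return)(void);   /* the one fixed slot for sub() */

    static void sub(void) {
        printf("in subroutine\n");
        sub_return();                  /* "return" = jump through the slot */
    }

    static void back_in_caller(void) {
        printf("back in caller\n");
    }

    int main(void) {
        sub_return = back_in_caller;   /* store the return address... */
        sub();                         /* ...then jump to the subroutine */
        return 0;
    }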
If you put the pid in the high bits of every pointer you run into problems when you want to support fork(). The child gets a new pid but keeps all the same pointers as its parent.
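A tiny demo of that (just an illustration, not anyone's proposal): after fork() the child sees a bit-identical copy of the parent's address space, so every pointer keeps the same value while the pid changes underneath it.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int *p = malloc(sizeof *p);    /* allocated before the fork */
        pid_t child = fork();
        if (child == 0) {
            /* New pid, same bit pattern in p: a pid embedded in the
             * pointer's high bits would now identify the wrong process. */
            printf("child  pid=%d p=%p\n", (int)getpid(), (void *)p);
        } else {
            printf("parent pid=%d p=%p\n", (int)getpid(), (void *)p);
            wait(NULL);
        }
        free(p);
        return 0;
    }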
And that only protects you against one process trying to read/write another process's memory. We already have good support for this type of protection on current hardware/OSes. What is desired is something that prevents a process from reading/writing its own memory through an out-of-bounds pointer. This would prevent buffer overflows, stack smashing, etc., and be a good defense against RCE.
Non-contiguous stacks are typically just linked lists of call frames, or groups of call frames. One of the examples that used to be regularly trotted out on comp.lang.c 15+ years ago was IBM mainframe environments, where stacks were linked lists but still managed by hardware.
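A minimal sketch of the linked-frame idea, modelled in C with explicit malloc/free (names are made up, and the machine stack is of course still used underneath; in the real systems the compiler or hardware maintains the links):

    #include <stdio.h>
    #include <stdlib.h>

    /* Each activation record is heap-allocated and linked to its
     * caller's, so frames can live anywhere in memory. */
    struct frame {
        struct frame *caller;   /* link to the previous activation record */
        int n, result;          /* this frame's locals */
    };

    static int fact(struct frame *caller, int n) {
        struct frame *f = malloc(sizeof *f);   /* "push" = allocate */
        f->caller = caller;
        f->n = n;
        f->result = (n <= 1) ? 1 : n * fact(f, n - 1);
        int r = f->result;
        free(f);                               /* "pop" = free */
        return r;
    }

    int main(void) {
        printf("%d\n", fact(NULL, 5));  /* prints 120 */
        return 0;
    }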
Off the top of my head, current examples of non-contiguous stacks on more familiar systems, with supporting code compiled directly to machine code (i.e. not language interpreters and VMs, where this is actually very common), include:
1) gccgo, which compiles Go code to use split stacks.
2) gcc with -fsplit-stack, which permits compiling C (and C++?) code to use the same stack discipline as gccgo.
3) OCaml's new concurrency work utilizes non-contiguous stacks. I'm not familiar with OCaml toolchains, but notably they do generate DWARF and related tables so both debuggers and the garbage collector can properly walk logical stacks. See https://arxiv.org/abs/2104.00250
4) There's a stack-smashing and ROP prevention technique called shadow stacks, where a single logical call stack is implemented using two contiguous stacks, one primarily for return addresses (or related metadata) and the other primarily for objects. See https://en.wikipedia.org/wiki/Shadow_stack Both GCC and clang support software-implemented shadow stack disciplines (e.g. SafeStack), and some newer Intel and ARM ISA extensions implement hardware-enforced shadow stacks. (Rough sketch below.)
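A hand-rolled illustration of the shadow-stack idea (not how SafeStack or CET actually instrument code; __builtin_return_address is a real GCC/clang builtin, everything else here is made up, and a compiler may keep the return address in a register, so treat this purely as a sketch):

    #include <stdlib.h>
    #include <string.h>

    /* Return addresses are mirrored onto a second stack on entry and
     * compared on exit, so an overflow that rewrites the return address
     * on the data stack is caught before the function returns through it. */

    #define SHADOW_DEPTH 1024
    static void *shadow[SHADOW_DEPTH];
    static int shadow_top;

    static void shadow_push(void *ret) {
        if (shadow_top == SHADOW_DEPTH) abort();
        shadow[shadow_top++] = ret;
    }

    static void shadow_check(void *ret) {
        /* Mismatch = the saved return address was overwritten: abort
         * instead of jumping to attacker-controlled code. */
        if (shadow_top == 0 || shadow[--shadow_top] != ret) abort();
    }

    void vulnerable(const char *input) {
        shadow_push(__builtin_return_address(0));  /* "prologue" */
        char buf[16];
        strcpy(buf, input);                        /* the classic overflow */
        shadow_check(__builtin_return_address(0)); /* "epilogue" */
    }

Real implementations do the push/check in compiler-generated prologues and epilogues (or in hardware), which is what makes them actually reliable.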
The linked article doesn't mention call stacks explicitly, but it describes the R1000 arch as object+offset addressed in HW. So unless they restricted the call stack to fit into one object and used only offsets, then yes, they must have chained objects together for the stack.
When you have a page-based memory model, you've made address locality important by construction. If you have an object-based memory model, and the working set is made up of objects, not pages, then address locality between objects doesn't matter.
Of course, page-based memory models are by FAR the most common in practice.
(Note: pages ARE objects, but the objects are significant to the VM system and not to your program. So strictly, page-based models are a corner case of object-based models, where the objects are obscure.)
would be interesting to see how the actual call stack is implemented. they must either have a fixed-width object as you mention, or some kind of linear chaining like you're describing.
memory and disk are unified into one address space, and code is represented by this "DIANA" structure, which can be compressed text, text, AST, or machine code. would be curious how procedures are represented in machine code.
When you reverse engineer "normal" code, you know the CPU can handle integers of X, 2X, 4X bit widths, you know it wants things aligned this way or that way etc.
The R1000 is bit addressed, and integers can be any width from 1 to 64 bits, structures can be "variant" (See: Ada) so reverse engineering the storage layout is ... not an accomplished feat yet.
Since the primary task for the R1000 was Ada program development, and since there was a standardized semantic representation of Ada programs ("DIANA"), one of the fundamental hardware/microcode data-types is "DianaTree", which is, I think, a pretty generic CS tree of some kind.
These "DianaTree" types seem to also have been used for other stuff, like directories in the object store etc.
Fun fact: for many languages a linear memory block doesn't work, because execution context lifetimes aren't strictly LIFO. For example, continuations are a language feature that causes this (unless they're restricted, e.g. to single-shot).
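You can get a taste of non-LIFO context lifetimes even in C with POSIX ucontext (a real, if obsolescent, API; it only gives you one-shot coroutines, which matches the single-shot caveat above): a suspended context's frames stay alive on their own heap-allocated stack while the caller keeps running.

    #include <stdio.h>
    #include <stdlib.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, coro_ctx;

    static void coro(void) {
        printf("coro: first run\n");
        /* Suspend: these frames stay live while main runs. */
        swapcontext(&coro_ctx, &main_ctx);
        printf("coro: resumed\n");
    }   /* returning switches to uc_link, i.e. back to main */

    int main(void) {
        getcontext(&coro_ctx);
        coro_ctx.uc_stack.ss_sp = malloc(64 * 1024); /* heap-allocated stack */
        coro_ctx.uc_stack.ss_size = 64 * 1024;
        coro_ctx.uc_link = &main_ctx;
        makecontext(&coro_ctx, coro, 0);

        swapcontext(&main_ctx, &coro_ctx);  /* run coro until it suspends */
        printf("main: coro suspended, its frames outlive this point\n");
        swapcontext(&main_ctx, &coro_ctx);  /* resume coro to completion */
        free(coro_ctx.uc_stack.ss_sp);
        return 0;
    }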
are there alternatives to linearly growing call stacks?