So this is a whole lot more complicated these days. There's not one stack but many, one for each thread; regions typically have guard pages; and all these regions are set up with mmap (so there's no sbrk syscall anymore), just for starters.
There is a mallopt parameter named M_MMAP_THRESHOLD. In general:
If the requested size is less than the threshold, brk() will be used;
If the requested size is greater than or equal to the threshold, mmap() will be used.
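For anyone who wants to poke at this, here is a minimal sketch, assuming glibc on Linux. mallopt(), M_MMAP_THRESHOLD and sbrk() are real glibc interfaces; the helper names set_mmap_threshold and malloc_moves_break are made up for illustration. It only observes whether the program break moved, which is an indirect hint about which path malloc took, not proof.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes sbrk() under strict -std= modes on glibc */
#endif
#include <malloc.h>   /* mallopt, M_MMAP_THRESHOLD (glibc-specific) */
#include <stdlib.h>
#include <unistd.h>   /* sbrk */

/* Set the mmap threshold in bytes; mallopt returns 1 on success, 0 on error. */
int set_mmap_threshold(int bytes)
{
    return mallopt(M_MMAP_THRESHOLD, bytes);
}

/* Return 1 if malloc(size) moved the program break, 0 if it did not,
 * and -1 if the allocation failed. */
int malloc_moves_break(size_t size)
{
    /* Warm-up allocation so one-time allocator setup is not counted.
     * volatile keeps the compiler from eliding the malloc/free pairs. */
    void *volatile warm = malloc(16);
    free(warm);

    void *before = sbrk(0);
    void *volatile p = malloc(size);
    void *after = sbrk(0);
    if (p == NULL)
        return -1;
    free(p);
    return before != after;
}
```

With the threshold set to 64 KiB, a 1 MiB request should leave the break where it was, since glibc is expected to serve it from mmap.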
This is specific to glibc on Linux, though. Non-glibc mallocs have different tuning and many have completely forsworn brk/sbrk (the OpenBSD and DragonflyBSD mallocs haven't used brk in more than a decade[0]). In fact brk/sbrk is deprecated in every BSD:
On OpenBSD, DragonflyBSD, NetBSD and OSX:
> The brk and sbrk functions are historical curiosities left over from earlier days before the advent of virtual memory management.
On FreeBSD:
> The brk() and sbrk() functions are legacy interfaces from before the advent of modern virtual memory management. They are deprecated and not present on the arm64 or riscv architectures.
It's also somewhat discouraged on Solaris:
> The behavior of brk() and sbrk() is unspecified if an application also uses any other memory functions (such as malloc(3C), mmap(2), free(3C)).
They probably mean that it's no longer used by the allocator, which will be using mmap instead (although obviously that will depend on what allocator you use). The syscall itself is going nowhere.
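To make "using mmap instead" concrete, this is roughly the primitive such an allocator builds on. A minimal sketch, assuming a POSIX system that supports MAP_ANONYMOUS (Linux and the BSDs do); region_alloc and region_free are hypothetical names, and a real allocator layers a lot of bookkeeping on top of this.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Ask the kernel for len bytes of zero-filled, private memory.
 * Returns NULL on failure. */
void *region_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hand the region back to the kernel; returns 0 on success, -1 on error. */
int region_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

Unlike brk, each region can be returned independently, which is one reason allocators tend to prefer it.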
Also very important: with 64 bit there is a lot more room to play with. There is also address space layout randomization, which at least on 64 bit every program _should_ be compiled with.
Also, on 64 bit isn't the last memory page, like the first, unmappable? (I think at least on Linux it is.)
Lastly, isn't there a region of unmappable virtual address space on 64 bit in the middle of the virtual address space, due to the chips which are doing the mapping not handling the full 64 bits of virtual address space? I sadly can't find any info about this, but I remember having read about it before. Maybe I mixed something up.
> Lastly, isn't there a region of unmappable virtual address space on 64 bit in the middle of the virtual address space, due to the chips which are doing the mapping not handling the full 64 bits of virtual address space?
Yes. x86-64 currently has a 48-bit address space split into two ranges (0 to 00007FFFFFFFFFFF and FFFF800000000000 to FFFFFFFFFFFFFFFF). ARMv8 requires 48 bits and allows 52, but apparently the sizes of the userspace and kernel regions are configurable (though limited by the chip, i.e. you could reduce the size of the kernel space down from 48 bits but can't increase it).
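The two ranges quoted above are exactly the addresses whose bits 47 through 63 are all equal, and the hole in the middle is everything else. A small sketch of that check, assuming a 48-bit virtual address width; is_canonical_48 is a name invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is usable with a 48-bit virtual address space:
 * bits 47..63 must be all zeros or all ones. */
bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the 17 bits 47..63 */
    return top == 0 || top == 0x1FFFF;
}
```

So 0x00007FFFFFFFFFFF and 0xFFFF800000000000 pass, while 0x0000800000000000 falls in the hole.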
> As I have said previously, memory is like a huge array with (say) 0xffffffff elements. A pointer in C is an index to this array. Thus when a C pointer is 0xefffe034, it points to the 0xefffe035th element in the memory array (memory being indexed starting with zero).
I'm not sure how true this is outside of a particular platform/compiler. As far as I'm aware, C doesn't actually define how pointers are represented, only that they are a reference to memory (although null is a special case). Pointers in C are very abstract which allows for much more aggressive optimisations.
And all this is before we get into how memory actually works in practice, such as CPU cache lines.
You do have to be able to cast from a pointer to an appropriately-sized integer and back, however [1]. This makes the semantics fuzzy and ill-defined in some cases [2].
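For reference, the round trip looks like this. A minimal sketch, assuming the implementation provides uintptr_t (it is optional in the standard, though practically universal); survives_round_trip is a hypothetical helper name.

```c
#include <stdint.h>

/* Convert a pointer to an integer and back; returns 1 if the result
 * compares equal to the original pointer. */
int survives_round_trip(void *p)
{
    uintptr_t as_int = (uintptr_t)p;   /* pointer -> integer */
    void *back = (void *)as_int;       /* integer -> pointer */
    return back == p;
}
```

The guarantee is only that the pointer comes back equal. What the integer value itself looks like is implementation-defined, which is where the fuzziness starts.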
If you thought segmented memory was weird, then try something like an 8051 (3-byte "generic" pointers, stored in semi-big-endian order) or other Harvard-architecture microcontroller.
Yes, you can implement C in other ways (I've worked on a C JIT that abstracts from this flat memory model, for example), but come on, we all know this is how C works on most machines most of the time. They shouldn't need to add a lot of disclaimers that it could theoretically be done a different way when they're just trying to raise awareness of how things work in practice.
The issue is that C does not work that way on modern machines. Note that old Alpha machines had doubleword-aligned pointers and no byte or word load instructions, so indexes into the array had to be multiples of 4. More important, aliasing rules preclude treating memory like one big array: https://gist.github.com/shafik/848ae25ee209f698763cffee272a5.... C99 and newer go to some lengths to permit the optimizer to treat pointers as pointing into disjoint byte ranges (which allows the optimizer to assume they cannot alias). Accordingly, the mental model of a big array of memory is, at least for C, generally unsound.
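One everyday consequence of the aliasing rules: you can't just cast a pointer and read the same bytes as another type. A short sketch of the well-defined alternative, assuming float is a 32-bit IEEE 754 type (true on mainstream platforms, not required by the standard); float_bits is a name invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Read the bit pattern of a float without breaking the aliasing rules.
 * Writing *(uint32_t *)&f instead would access a float object through
 * an incompatible lvalue type, which is undefined behaviour. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* assumes sizeof(float) == 4 */
    return bits;
}
```

Compilers generally turn this memcpy into a single register move, so the defined version costs nothing.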
> Unfortunately, you cannot access all elements of memory
That's rather fortunate, instead. Programming (esp. in C) under an operating system/hardware architecture that does not provide this protection is a real pain. Memory protection is a feature that is meant primarily to help developers (to say nothing about security).
> As it turns out, the first 8 pages on our hydra machines are void. This means that trying to read to or write from any address from 0 to 0xffff will result in a segmentation violation.
I have a similar issue. I was following "A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux"[0], and decided to put the start of the .text section at virtual address 0x0:
; tiny.asm
BITS 32
org 0x0
;
; (the same as the one in the teensy elf tutorial)
;
It results in a segmentation fault when run as a normal user, but works fine when run as superuser. Changing the code to use address 0x10000 fixes the problem.
My question: is my issue that I created an ELF that has its .text section inside that void region? Is this void region documented somewhere? What purpose does it serve?
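A likely explanation, assuming this is Linux: the kernel refuses to let unprivileged processes map anything below the vm.mmap_min_addr sysctl, which commonly defaults to 65536 (0x10000), while processes holding CAP_SYS_RAWIO (root, typically) are exempt. It exists mainly to make kernel NULL-pointer dereference bugs harder to exploit. A small sketch to read the current value; read_mmap_min_addr is a made-up helper name, and the /proc path is Linux-only.

```c
#include <stdio.h>

/* Return the lowest address an unprivileged process may map, as
 * reported by Linux, or -1 if the value cannot be read. */
long read_mmap_min_addr(void)
{
    long value = -1;
    FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

If this reports 65536 on your machine, that would match both the crash at 0x0 and the fix at 0x10000.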
Fun trick you can do, at least on Windows, is to append your data to the program's file, with the offset as the very last thing you write so that it's easy for the program to find. I've run various programs like this through virus scanners and the only ones to falsely flag them were the "neural network AI" scanners. So that shouldn't be a problem.
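Roughly what that looks like, as a sketch rather than a hardened tool. Assumptions: the trailer is an 8-byte offset in host byte order, the file fits in a long (so under 2 GiB where long is 32 bits, as on Windows), and append_payload / read_payload are names invented here. A real program would pass its own executable path to read_payload.

```c
#include <stdint.h>
#include <stdio.h>

/* Append len bytes to path, followed by an 8-byte trailer holding the
 * file offset where the payload starts. Returns 0 on success, -1 on error. */
int append_payload(const char *path, const void *data, size_t len)
{
    FILE *f = fopen(path, "ab");
    if (f == NULL)
        return -1;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
    long pos = ftell(f);                 /* current end of file */
    if (pos < 0) { fclose(f); return -1; }
    uint64_t start = (uint64_t)pos;
    int ok = fwrite(data, 1, len, f) == len
          && fwrite(&start, sizeof start, 1, f) == 1;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}

/* Read the payload back into buf (capacity cap).
 * Returns the payload length, or -1 on error. */
long read_payload(const char *path, void *buf, size_t cap)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    uint64_t start;
    long result = -1;
    /* The trailer is the last 8 bytes of the file. */
    if (fseek(f, -(long)sizeof start, SEEK_END) == 0) {
        long trailer_pos = ftell(f);
        if (trailer_pos >= 0
            && fread(&start, sizeof start, 1, f) == 1
            && start <= (uint64_t)trailer_pos) {
            size_t len = (size_t)((uint64_t)trailer_pos - start);
            if (len <= cap
                && fseek(f, (long)start, SEEK_SET) == 0
                && fread(buf, 1, len, f) == len)
                result = (long)len;
        }
    }
    fclose(f);
    return result;
}
```

Because the offset sits at a fixed distance from the end, the reader never needs to know how large the original executable was.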