The last few days i've been struggling with how go decides heap vs stack. Coming from a c/c++/java background where it's super clear (malloc or new goes on the heap, everything else goes on the stack), i've found go a little frustrating. i still don't have a good sense of when escape analysis will fail and things get pushed to the heap.
This article isn't about any of that. This is much lower level, and will give you insight into other languages' approaches to growing addressable space. i can't say i'm thrilled with the interleaved linked lists, but it's pretty neat and works.
i don't write assembly, and i don't think real hard about how virtual address space is managed. but maybe someday it'll be important and having read this article some bells will ring. the article is cool and worth a few minutes of your time.
edit
interleaved linked lists isn't the right term. i don't think it's _wrong_ but there's much more complexity with fitting the arenas. don't listen to me, i'm a fool.
> Coming from a c++ background where it's super clear, new goes on the heap everything else goes on the stack
AFAIK modern C++ practices eschew new/delete pretty completely, and while std::unique_ptr or std::vector will create something on the stack, it's a glorified pointer; the actual data lives on the heap.
Also probably doesn't help with the specific criticism, but FWIW the `-m` build flag should print out heap escapes. I don't know that there are editors which integrate this though. And of course there are cases where you want the escape and heap allocation, though one possibility would be to ban escape analysis and use `new` when you want to force a heap allocation, maybe?
edit: `-m` is pretty noisy though; lots of constructs will cause non-bypassable escapes, e.g. the fmt functions will "escape" their format string and every single parameter: https://github.com/golang/go/issues/8618
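To see it in action, here's a minimal sketch (my own toy functions, not from the linked issue) of the two cases the flag distinguishes:

```go
// escape.go
package main

// sum keeps its local state on the stack: nothing outlives the call.
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

// newCounter returns the address of a local, so escape analysis has to
// move n onto the heap.
func newCounter() *int {
	n := 0
	return &n
}

func main() {
	_ = sum([]int{1, 2, 3})
	_ = newCounter()
}
```

Building with `go build -gcflags='-m' .` should print something like `moved to heap: n` for `newCounter` and nothing of the sort for `sum`, plus the usual inlining chatter that makes the output noisy.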
happy to admit my c++ experience is very, very old. I totally understand that things have changed. But even back in the day, libraries could do stuff that hid 'new' from me. Heck, java libs will call 'new' all the time. the same goes for c/malloc
I can cope with accepting a library that does god knows what. but in my own code, i can see the stack/heap split. perhaps it's a security blanket. perhaps it's pointless.
this insight is actually really helpful. in some cases i care, in some cases i turn a blind eye. maybe writing good go means i turn a blind eye all the time. unless i need to inspect the escapes - which go provides good tools for. _chin scratch_
> happy to admit my c++ experience is very, very old. I totally understand that things have changed. But even back in the day, libraries could do stuff that hid 'new' from me. Heck, java libs will call 'new' all the time. the same goes for c/malloc
Yeah, Java necessarily `news` stuff, I believe it's the only way to instantiate classes (ignoring factories).
Calling new in C++ does not mean you allocate on the heap, rather that the heap might be used.
Placement new, global new handler and allocators change that behavior.
Additionally, following in the footsteps of heap-heavy languages, C++ optimizers also do escape analysis, and even without overriding new's default behavior the stack might be used instead if the types are proven not to escape scope.
Even factories. Java puts everything on the heap. There's no on-the-stack object.
Java and go are both pass by value. Java makes it easy, because i know everything is a pointer to a thing. I (unreasonably?) freak out about go because i don't understand when the array backing a slice changes. I'm just chalking it up to me being old and dumb. Seems hard to track, but tons of people do it, so the issue must be me.
I meant "ignoring factories" in the sense of "practically you don't new to get an instance out of a factory, but ultimately it does that internally so let's just ignore this as it's just nit-picking".
> I (unreasonably?) freak out about go because i don't understand when an array backing a slice change.
That seems like the simplest one ever: a slice is a super shitty version of an ArrayList[0], so the backing array is modified on assignment (direct, or through the slice), and can be replaced implicitly on reallocation. Same as the elementData member of an ArrayList.
[0] because the backing buffer can trivially be shared between slices, which is not the case for ArrayList
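A quick Go sketch of both behaviors (toy values, obviously):

```go
package main

import "fmt"

func main() {
	// Two slices over one backing array: a write through either one is
	// visible through the other.
	a := []int{1, 2, 3, 4}
	b := a[:2]
	b[0] = 99
	fmt.Println(a[0]) // 99: shared backing array

	// Appending past capacity allocates a new backing array, so the
	// two slices silently stop sharing storage.
	c := make([]int, 2, 2)
	d := append(c, 3)
	d[0] = 42
	fmt.Println(c[0], d[0]) // 0 42: c still points at the old array
}
```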
Sure there is, for primitive types; for objects the compiler gets to decide what lands on the stack given escape analysis, until Project Panama finally gets integrated.
Additionally both IBM and Azul JVMs do have extensions for stack allocations.
A nearly free allocation is still more expensive than being able to not allocate at all, and the nearly free allocation will also add to the GC's workload.
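Staying in Go since that's where the thread started, here's a hypothetical micro-benchmark (names made up) that makes the difference visible; `go test -bench . -benchmem` reports allocs/op, and the GC cost shows up in the timings:

```go
// alloc_test.go
package alloc

import "testing"

type point struct{ x, y int }

// sumValues keeps every point in the loop's stack frame; nothing escapes.
func sumValues(n int) int {
	total := 0
	for i := 0; i < n; i++ {
		p := point{x: i, y: i}
		total += p.x + p.y
	}
	return total
}

// sumPointers stores each point's address in a slice that outlives the
// loop body, so every point is heap allocated and the GC has to track it.
func sumPointers(n int) int {
	ps := make([]*point, 0, n)
	for i := 0; i < n; i++ {
		ps = append(ps, &point{x: i, y: i})
	}
	total := 0
	for _, p := range ps {
		total += p.x + p.y
	}
	return total
}

func BenchmarkValues(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = sumValues(1024)
	}
}

func BenchmarkPointers(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = sumPointers(1024)
	}
}
```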
I'm surprised that anything still uses brk. Mentally, I've thrown it in the same pile as fork or gets, but it's been a while since I've written an OS-level memory allocator, so maybe my assumptions are incorrect.
> Mentally, I've thrown it in the same pile as fork or gets
Why is fork included there? It's extremely useful in a wide variety of modern applications. Or are you referring to specific libc functions rather than the system call?
Comparing fork to gets is perhaps a bit unfair to fork. I was being lazy when I tried to suggest a spectrum of deservedly rare api calls. :) It’s not broken like gets or outdated like sbrk, but it doesn’t compose well with other parts of the system and there are enough caveats that I recommend more use-case targeted alternatives wherever possible.
> Since malloc() in libc uses brk, no language with C bindings can use brk.
Afaik there is no such thing as "the malloc in libc"; each libc has its own allocator which does its own thing. OpenBSD's hasn't used (s)brk in 15 years and Dragonfly pulled that in way back[0]. And FreeBSD's system allocator is jemalloc, which can use mmap or sbrk (if the latter is available and compiled in) but will prefer mmap (unless otherwise configured, though that knob may have been removed since I last checked):
> Traditionally, allocators have used sbrk(2) to obtain memory, which is suboptimal for several reasons, including race conditions, increased fragmentation, and artificial limitations on maximum usable memory. If sbrk(2) is supported by the operating system, this allocator uses both mmap(2) and sbrk(2), in that order of preference; otherwise only mmap(2) is used.
More broadly, (s)brk is considered deprecated in most if not all BSDs:
* OpenBSD: The brk() and sbrk() functions are historical curiosities left over from earlier days before the advent of virtual memory management.
* FreeBSD: The brk() and sbrk() functions are legacy interfaces from before the advent of modern virtual memory management. They are deprecated and not present on the arm64 or riscv architectures. The mmap(2) interface should be used to allocate pages instead.
* macOS: The brk and sbrk functions are historical curiosities left over from earlier days before the advent of virtual memory management.
* NetBSD: The brk and sbrk functions are legacy interfaces from before the advent of modern virtual memory management.
And while glibc might use (s)brk, it hopefully doesn't bother releasing the (s)brk allocations as that would be a significant amount of work for usually no ability to release anything anyway (unless it's an arena-type use), meaning others can probably (s)brk as well, as long as they don't assume they're the only caller.
Another point of interest: on Solaris[1] and its descendants ([2], [3]) not only is mmap(2) recommended, the manpage plain tells you
> The behavior of brk() and sbrk() is unspecified if an application also uses any other memory functions (such as malloc(3C), mmap(2), free(3C)).
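If you want to poke at the mmap(2) route without dropping into C, here's a minimal Go sketch (Go only because that's the language the thread started with; the Go runtime already grabs its memory this way on Unix) that maps and releases a few anonymous pages:

```go
//go:build unix

package main

import (
	"fmt"
	"syscall"
)

func main() {
	size := 4 * syscall.Getpagesize()

	// Anonymous private mapping: ask the kernel for fresh pages anywhere
	// in the address space instead of growing the break with (s)brk.
	mem, err := syscall.Mmap(-1, 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		panic(err)
	}
	defer syscall.Munmap(mem)

	mem[0] = 42
	fmt.Printf("mapped %d bytes, first byte = %d\n", len(mem), mem[0])
}
```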