Memory Management Every Programmer Should Know (zacharylee.substack.com)
75 points by zacharylee 9 months ago | 15 comments



Have to admit I'm a little confused by this blog post. I think the target audience is people working in managed languages (e.g. on the web?), especially given the part about primitive types and objects in the intro. In that case, "stack" and "heap" are kind of different from what the native stack and heap are. Often your (runtime-allocated) stack will contain just data and will itself live in the heap. However, if you read the rest of the blog post, it is clearly talking about native code. (Though I will note that not just GPRs get spilled on the stack: it also contains locals.) I guess this could be useful for introducing some of the terms people use when talking about memory management, but there's no tie to any particular language, because the post doesn't really pick one. So it's difficult to generalize from what it says or put it to use, because languages do these things differently.


Isn't that stack/heap distinction between managed and native code just a technicality? I mean, for native code, your stack is also just allocated space in memory, right?

Similarly, your stack in a managed language is allocated on the heap of the runtime itself, sure. But in the end, once the managed code starts running, for the code itself it's a stack. Or am I mistaken and there is some special handling of stack memory vs heap memory? As far as I understand, the stack is just a big preallocated block of memory that we start filling up.

I agree that the rest of the article is pretty thin, though. It goes into managed code, mentioning tracing GC and reference counting, but doesn't really explain what those are and doesn't explain memory management at all.

I would've expected coverage of different allocation strategies like arenas; this is disappointing.


The difference is in the name. Stack allocation is strictly LIFO [1], so it can be implemented very efficiently by just incrementing and decrementing a stack pointer, even on architectures that don't directly have a hardware stack. In principle, the stack for local variables doesn't even need to be the same stack as the call stack. It is also strictly thread-local.

On the other hand, a heap (which has nothing to do with the heap data structure) is much more complex.

[1] At least across activation frames; of course a compiler can statically generate code that reuses frame slots in a non-LIFO way.
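
To make "just incrementing and decrementing a pointer" concrete, here's a toy sketch in C. It's not what a compiler actually emits, just an illustration of why LIFO allocation is so cheap: both allocation and deallocation are a single pointer adjustment, and running out of room is exactly a stack overflow.

    #include <stddef.h>

    /* Toy bump allocator: LIFO allocation is just pointer arithmetic,
       mimicking what pushing and popping stack frames does in hardware. */
    static unsigned char arena[1 << 16];  /* pre-reserved block */
    static size_t top = 0;                /* the "stack pointer" */

    void *stack_alloc(size_t n) {
        if (top + n > sizeof arena) return NULL;  /* "stack overflow" */
        void *p = &arena[top];
        top += n;      /* allocate: one addition */
        return p;
    }

    void stack_free(size_t n) {
        top -= n;      /* free: one subtraction, strictly LIFO */
    }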


Yes, memory is memory in the same sense that exponentiation is just repeated addition of 1 to a number. The article isn't really talking about memory so much as discussing storage, which is often colloquially called "memory" by people who don't deal with it directly.

What the article is trying to do is distinguish between scoped storage and non-scoped storage. Those are two of the four types of storage used in a modern programming language's abstract machine. All four types have backing memory that is allocated in different ways, at different times, and initialized through different sequences; on modern processors they are also usually processed in registers, which are synchronized to their backing store through various levels of cache in different ways as well. The article is a very simplistic introduction to basic abstract storage concepts targeted at a complete beginner.

I would have expected an article about memory management to discuss the different types of memory, or maybe caching, or how memory allocation and garbage collection work. But it's fine as a very high-level introduction to how storage works in a typical programming language abstract machine.


"I mean, for native code, your stack is also just allocated space in memory, right?"

Yes. On Linux, at least, the thread stacks and the memory on the heap are usually even allocated with the same syscall (mmap). So the difference is not a question of provenance but of how the memory is managed. The native stack is the one the (hardware register) stack pointer points to, and there can only be one per thread.

Since this stack's management is hardware-supported, it can be very fast and efficient. Of course, nothing prevents you from having as many stacks on the heap as you like and managing them yourself. As far as I know this is not what managed languages do; I think they still rely on the amenities of the hardware-supported stack.
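
A minimal sketch of the provenance point, assuming Linux: both regions below come from the exact same syscall, and only what the program does with them afterwards makes one a "stack" and the other a "heap". (MAP_STACK is a real flag, though historically little more than a hint.)

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void) {
        size_t len = 1 << 20;  /* 1 MiB each */
        void *stack = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
        void *heap  = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (stack == MAP_FAILED || heap == MAP_FAILED) return 1;

        /* Same provenance, same permissions; the difference is purely
           in how each region gets managed by the program. */
        printf("stack-ish region at %p, heap-ish region at %p\n", stack, heap);
        munmap(stack, len);
        munmap(heap, len);
    }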


> Therefore, data whose size cannot be determined at compile time or whose size can be changed cannot be safely placed on the stack.

"someone is wrong on the internet" -- afaik VLA (variable length arrays) can go on the stack just fine, at the very least stuff like alloca is just that. It's just a matter of incrementing the stack pointer by a variable length before calling the next function.

The problem with the stack and large objects is that a program has no standard way of knowing how large the stack is; at some point, if you grow the stack too much, you'll start hitting another memory region used for something else (another thread's stack, some library mapping, or the heap). And alloca's failure mode is basically just "bad things happen", so it's not recommended... But it's definitely not a "known at compile time" problem.
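
For the record, a quick C99 sketch of a VLA whose size is only known at run time; the commented-out call shows the failure mode, since nothing checks the requested size against the remaining stack:

    #include <stdio.h>

    /* The compiler bumps the stack pointer by a run-time amount here. */
    void print_last_square(int n) {
        int squares[n];               /* C99 VLA, lives on the stack */
        for (int i = 0; i < n; i++)
            squares[i] = i * i;
        printf("last: %d\n", squares[n - 1]);
    }   /* freed implicitly when the frame is popped */

    int main(void) {
        print_last_square(16);        /* fine */
        /* print_last_square(100000000);  likely a crash: nothing
           validates the size against the remaining stack space */
    }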


The key is "safely placed". VLAs and alloca keep being deprecated everywhere because they are a time bomb.

In practice you need a safe upper bound for VLAs and alloca, so you might as well just allocate that upper bound as a fixed-size array outright. If you don't initialize the array, there is no real additional cost other than the extra stack probes. On GCC at least, alloca and VLAs disable some optimizations (like inlining), so it is really a wash.

If you need truly unbounded-size allocations, you need a heap fallback anyway.
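
Something like this common small-buffer pattern, sketched in C (the names and the 256 bound are made up for illustration): a fixed upper bound on the stack for the expected case, with a heap fallback for anything bigger.

    #include <stdlib.h>

    #define SMALL_MAX 256

    void process(size_t n) {
        int small[SMALL_MAX];              /* fixed upper bound, on the stack */
        int *buf = (n <= SMALL_MAX) ? small
                                    : malloc(n * sizeof *buf);
        if (buf == NULL) return;           /* heap allocation fails cleanly,
                                              unlike an oversized VLA */

        /* ... use buf[0..n-1] ... */

        if (buf != small)
            free(buf);
    }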


In my experience, most developers don't even know how arrays/slices/lists work, or that they are reallocating stuff all over the place in ridiculously inefficient ways. They don't even know the difference between length and capacity.

It's nice to try to create awareness, though. But the heap/stack split is not the first thing that would come to my mind, tbh.
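
For anyone following along, the length/capacity distinction in a nutshell, as a C sketch of a growable array (roughly what a slice/list wraps under the hood): length is how many elements are in use, capacity is how many fit before the next reallocation, and doubling keeps appends amortized O(1).

    #include <stdlib.h>

    typedef struct {
        int   *data;
        size_t len;   /* elements in use */
        size_t cap;   /* elements allocated */
    } vec;

    int vec_push(vec *v, int x) {
        if (v->len == v->cap) {           /* out of capacity: reallocate */
            size_t ncap = v->cap ? v->cap * 2 : 8;
            int *nd = realloc(v->data, ncap * sizeof *nd);
            if (nd == NULL) return -1;
            v->data = nd;
            v->cap  = ncap;
        }
        v->data[v->len++] = x;            /* common case: no allocation at all */
        return 0;
    }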


> They don't even know the difference between length and capacity.

High-level languages go out of their way to make sure you don't need to worry about this stuff, so it's natural that people don't. And a lot of the time the language runtime applies so many optimisations that you can't really know what will happen (and even if you do know, that can change, since the language may not guarantee the behaviour will stay the same).

If you do JS/Python/Ruby and similar languages, there's zero reason to know the difference between heap and stack. In fact, I think most of the time what may look like a heap allocation may be optimised by the JIT to become a simple variable on the stack!

But of course, if you want to up your game and do more low-level stuff that can squeeze the most performance out of the hardware, it's important to learn that stuff. But you probably need a language that enables you to do that. Rust, C, D, Nim, etc. all allow you to tell the compiler exactly how to manage memory. Which is a blessing or a curse, depending on whether you care about it or not.

Languages like Java and C# are right in the middle: they do let you manage some stuff (e.g. ArrayList capacity that you mention) while hiding others (Java doesn't let you directly ask for some reference type to go on the stack but does escape analysis to try really hard to put your Object memory on the stack for you anyway).


It makes sense to potentially be ignorant of capacity, because it's just an optimisation. Your growable array type (the thing most likely to have a capacity, because it will definitely have the amortized growth strategy; even Python doesn't get that wrong) doesn't even need to reveal this feature to you†, whereas length is an actual property of the container.

† It should, because even a very high-level programmer can significantly impact performance by providing useful hints here, but it doesn't have to. And hey, C++ programmers think they're writing low-level code and their hint API is rubbish (and the language's inventor says you shouldn't bother providing hints as a result), so apparently you're in good company.

It's actually interesting where capacity is used, and of those places where it's available to the programmer to influence. In principle a lot of types could benefit from smarter growth strategies, but often only the growable array type (and maybe a string type if your language at least knows those aren't just growable arrays) is given a suitable API.

For example, even crappy 1980s-style hash tables could avoid allocation overhead with a pre-allocation scheme based on size hints, i.e. knowing capacity. Open-addressed hash tables can do even better here, because it's a natural fit for them.
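
A rough sketch of what such a size hint buys an open-addressed table (the 0.75 load factor and power-of-two sizing are just illustrative assumptions): pick the capacity up front so no rehash ever happens while inserting the hinted number of entries.

    #include <stdlib.h>

    typedef struct {
        long  *keys;   /* open addressing: one flat slot array */
        size_t cap;
        size_t len;
    } table;

    int table_init_with_hint(table *t, size_t expected) {
        size_t cap = 8;
        while (cap * 3 / 4 < expected)  /* keep the load factor under 0.75 */
            cap *= 2;
        t->keys = calloc(cap, sizeof *t->keys);  /* one allocation, ever */
        if (t->keys == NULL) return -1;
        t->cap = cap;
        t->len = 0;
        return 0;
    }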


> Java doesn't let you directly ask for some reference type to go on the stack but does escape analysis to try really hard to put your Object memory on the stack for you anyway

Is that a new HotSpot feature? I thought it could only do scalar replacement (avoiding the object allocation and keeping the object's fields in registers).


Indeed, tuning capacity can be helpful for improving performance, although it is often not surfaced in many high-level programming languages.


If the compiler knew the stack size, we wouldn't have StackOverflow.


Is it hard to implement? You'd have to forbid loops in the potential call graph, so basically no recursion, direct or indirect. In that case, a compiler with global visibility can calculate the maximum call-stack depth.

I mean, it doesn't sound particularly hard technically, but is it too limiting for developers?
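
The computation itself is indeed simple: with an acyclic call graph it's just a longest-path walk weighted by frame sizes. A toy sketch in C (the frame sizes and call edges here are made-up inputs):

    #include <stddef.h>

    #define NFUNCS 4

    static const size_t frame_size[NFUNCS] = { 64, 128, 32, 256 };
    /* Callee lists, -1 terminated: which functions each one may call. */
    static const int calls[NFUNCS][NFUNCS + 1] = {
        { 1, 2, -1 },   /* f0 calls f1 and f2 */
        { 3, -1 },      /* f1 calls f3 */
        { 3, -1 },      /* f2 calls f3 */
        { -1 },         /* f3 is a leaf */
    };

    static size_t memo[NFUNCS];

    /* Worst-case stack usage starting at f: its own frame plus the
       deepest callee. Memoized DFS; terminates because there are no
       cycles, i.e. no recursion. */
    size_t max_stack(int f) {
        if (memo[f]) return memo[f];
        size_t deepest = 0;
        for (int i = 0; calls[f][i] != -1; i++) {
            size_t d = max_stack(calls[f][i]);
            if (d > deepest) deepest = d;
        }
        return memo[f] = frame_size[f] + deepest;
    }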


It's not so easy for the compiler if you use virtual functions (dynamic dispatch), dynamic linking, function pointers, etc., as the complete call graph might not be known at compile time.



