One annoyance with ESP-IDF is its internal use of malloc, which means our firmware, which needs long uptimes, has to be super careful about how it approaches things. Our code is in Nim, so it’s basically a matter of not using seq/string/ref T unless we absolutely need to (which is never, so far), but we’re still seeing some memory fragmentation and leaks after N days of uptime on a busy system.
I think I need to go tweak LwIP and rip out esp_modem for a custom PPPoS implementation, as it currently leaks memory on every socket creation (and we’re not the only ones to hit it).
Proprietary stacks in general tend to suck in that respect: the lack of visibility and control shows itself especially sharply, be it Espressif, Nordic, or any other.
Though at least it's "open" in the open-source sense, so we can (and have) hacked at some of the bits we've had to use to remove some of the malloc usage. But you're right.
I do wish we could've gone to Zephyr instead, it has real promise IMO. Though I'm biased because I prefer the Linux kernel-like approach haha
I've poked around with -fsanitize=kernel-address in embedded environments, and I think there's real potential there.
However, in this case, I wonder how well -fsanitize=leak might work. Override free() to show when it is called and print a stack trace, and you get a good look at your system's actual allocation pattern.
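For the free()-override half of that, here's a minimal sketch assuming a hosted glibc target (bare metal would need a platform-specific unwinder instead of <execinfo.h>); link with -Wl,--wrap=free so every call to free() is routed through the wrapper:

```c
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

void __real_free(void *ptr);   /* the original free(), resolved by the linker */

void __wrap_free(void *ptr)
{
    void *frames[16];
    int n = backtrace(frames, 16);

    fprintf(stderr, "free(%p) called from:\n", ptr);
    /* backtrace_symbols_fd() writes straight to the fd and doesn't call
     * malloc(), so it's reasonably safe inside an allocator hook. */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);

    __real_free(ptr);
}
```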
As an aside, there's nothing wrong with malloc in these embedded environments; it's the freeing that causes memory fragmentation. If you can also avoid running out of memory, you're good to go, since you essentially have a fancy form of static allocation!
> it's [just] the freeing that causes memory fragmentation.
Nope. Consider for example a microcontroller with 64K of address space, consisting of 32K of low RAM, 8K of ROM/MMIO, and 24K of high RAM. Allocate buffers of size 23K, 15K, and 15K. With a straightforward first-fit allocator[0], you'll put the 23K in low RAM, leaving 9K, put the first 15K in high RAM, leaving 9K there, and fail to allocate the second 15K, because even though you have 18K free, it's fragmented into two 9K regions.
If you'd allocated that memory statically, you could have put the 23K in high RAM, and the two 15Ks in low RAM, with 2K low and 1K high left over.
It's debatable whether the fragmentation is 'caused' by malloc itself or by the ROM region in the middle of the address space, but you never called free, so it's not caused by free.
0: Less naive allocation strategies take more work to 'trap' like this, but I can work out an example for best-fit or whatever if you're sceptical that this isn't just a problem specific to first-fit.
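If it helps, the scenario above boils down to something like this toy first-fit simulation (region sizes taken straight from the example; the allocator is deliberately naive):

```c
#include <stdio.h>

#define KB 1024u

/* The two RAM regions of the hypothetical 64K part; the 8K ROM/MMIO hole
 * sits between them, so a single allocation can never span both. */
static struct region { const char *name; unsigned free; } regions[] = {
    { "low RAM",  32 * KB },
    { "high RAM", 24 * KB },
};

/* First fit: take the first region with enough room, or report failure. */
static void alloc_first_fit(unsigned size)
{
    for (unsigned i = 0; i < 2; i++) {
        if (regions[i].free >= size) {
            regions[i].free -= size;
            printf("%2uK -> %s (%uK left there)\n",
                   size / KB, regions[i].name, regions[i].free / KB);
            return;
        }
    }
    printf("%2uK -> FAILED: %uK + %uK still free, but fragmented\n",
           size / KB, regions[0].free / KB, regions[1].free / KB);
}

int main(void)
{
    alloc_first_fit(23 * KB);  /* goes to low RAM, 9K left          */
    alloc_first_fit(15 * KB);  /* goes to high RAM, 9K left         */
    alloc_first_fit(15 * KB);  /* fails: 18K free, split into 9K+9K */
    return 0;
}
```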
The underlying problem is that malloc-style dynamic memory requires the allocations to stay in place from malloc until free, so if malloc makes a bad decision early on, it can't fix it once it finds out it made a mistake. And any nontrivial decision malloc makes can be made retroactively bad with the right (wrong?) set of subsequent requests.
This is part of why ESP-IDF has the heap_caps_* (malloc, calloc, free, etc.) interfaces, but it’s a decent amount of mental overhead to have to use a lot of it. It has meant a lot of planning and whiteboarding for the few places where we haven’t been able to avoid it.
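For anyone who hasn't used it, a minimal sketch of what that looks like (heap_caps_malloc and the MALLOC_CAP_* flags are real ESP-IDF APIs from esp_heap_caps.h; the function name and buffer sizes here are made up for illustration):

```c
#include <stdint.h>
#include <stddef.h>
#include "esp_heap_caps.h"

void setup_buffers(void)   /* hypothetical function, just for illustration */
{
    /* DMA-capable internal RAM, e.g. for a buffer handed to a peripheral. */
    uint8_t *dma_buf = heap_caps_malloc(4096, MALLOC_CAP_DMA);

    /* A large working buffer placed explicitly in external PSRAM
     * (assumes the board actually has PSRAM; the call fails otherwise). */
    uint8_t *work_buf = heap_caps_malloc(64 * 1024, MALLOC_CAP_SPIRAM);

    if (dma_buf == NULL || work_buf == NULL) {
        /* Handle failure: on a long-uptime system this is where
         * fragmentation eventually shows up. */
    }

    heap_caps_free(dma_buf);
    heap_caps_free(work_buf);
}
```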
In addition to the other comment, the primary problem with malloc() in embedded environments is not memory fragmentation but that memory-exhaustion errors get pushed to runtime rather than compile time. A call to malloc() in a rarely-called function might not exhaust available RAM until a system has been deployed for days or weeks, which is much worse than finding out your program requires too much memory at compile time, or at boot time. While stack overflows are always possible, many embedded projects at least have static analysis tools that can (with varying degrees of soundness) calculate worst-case stack use.
I guess it's opinion, but these are both "soft" checks, in my eyes. To me, "prevent" means it's not possible. These solutions are external checks, rather than "it literally can't work". In our case, it literally could not be allowed to work.
Any software written for microcontrollers or other low-memory environments is probably not going to use mmap or, more generally, anything that dynamically manipulates memory mappings (if there even is an MMU in the first place).
That said, the specific use of a function called sbrk seems odd and maybe archaic, since it has no useful role if the heap is fixed-sized anyway. But it looks like newlib (the libc being used here) uses _sbrk as part of an abstraction that supports many different environments [1], including not just bare metal but also running as a userland program under Linux and other kernels. So maybe it makes sense.
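For context, the bare-metal version of that abstraction is usually just a pointer bump between two linker symbols; a typical stub looks something like this (symbol names such as `end` and `__HeapLimit` vary between linker scripts):

```c
#include <errno.h>
#include <stddef.h>

extern char end;          /* linker-provided: end of .bss / start of heap */
extern char __HeapLimit;  /* linker-provided: top of the heap region      */

void *_sbrk(ptrdiff_t incr)
{
    static char *heap_end = &end;
    char *prev = heap_end;

    /* The heap is a fixed-size region, so "growing" it is just moving a
     * pointer and failing once we hit the limit. */
    if (heap_end + incr > &__HeapLimit) {
        errno = ENOMEM;
        return (void *)-1;
    }
    heap_end += incr;
    return prev;
}
```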
The kernel is forced to zero memory that is allocated with mmap. This is very expensive, and it is better to use sbrk and re-use allocated memory when possible. As such, the glibc malloc uses a sliding threshold to ensure that large, long-lived allocations are mmapped, whereas smaller allocations live inside "arenas", which are basically giant doubly linked lists that can expand via sbrk.
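That threshold is tunable if you want to watch the behaviour yourself on a hosted Linux box (mallopt() and malloc_stats() are glibc-specific; the 128K value below is just the documented default, picked for illustration):

```c
#include <malloc.h>
#include <stdlib.h>

int main(void)
{
    /* Requests of 128K and up are served by mmap(); smaller ones stay in
     * the sbrk-grown main arena. Setting the threshold explicitly also
     * switches off glibc's dynamic ("sliding") adjustment of it. */
    mallopt(M_MMAP_THRESHOLD, 128 * 1024);

    void *small = malloc(64 * 1024);    /* arena (sbrk) allocation */
    void *large = malloc(256 * 1024);   /* mmap-backed allocation  */

    malloc_stats();                     /* prints arena/mmap usage to stderr */

    free(small);
    free(large);
    return 0;
}
```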
The article provides an easy way of triggering a linker error if your microcontroller code inadvertently pulls in a library that does dynamic memory allocation.
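For reference, one way to get that effect with GNU ld (this may or may not be the article's exact recipe) is to --wrap the allocator symbols and deliberately define no wrappers:

```c
/* Add to the final link step (GNU ld or compatible):
 *
 *     -Wl,--wrap=malloc,--wrap=calloc,--wrap=realloc,--wrap=free
 *
 * --wrap rewrites every reference to malloc() into __wrap_malloc(). Since
 * no __wrap_malloc is ever defined, an image that never allocates links
 * cleanly, while any library that sneaks in a heap allocation fails the
 * link with an "undefined reference to `__wrap_malloc'" error pointing at
 * the offending object file. */
```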
I think most things (?) are dynamically-allocated in Python. This kind of manual memory management doesn’t exist there, which is one of the reasons it’s not used for bare-metal programming.
Rust with no_std (and without the alloc crate) won't do any hidden heap allocations. There's no allocator at all in such an environment! There is still a stack, and there are still static allocations. And of course you can still allocate statically in your linker script.
Ownership still matters, because nothing about ownership cares about heap allocation vs stack allocation.
Thanks for clarifying this. My dabbling in Rust a few years ago ended when I could not find a satisfying answer to the question "how do I prevent any unwanted dynamic allocations in this language?". At the time I found no good answers and gave up on trying to learn the language as that was not the kind of thing I was willing to relinquish to the wisdom of the compiler.
You should probably add some context for this to be meaningful.
Could be:
- Zig stdlib functions take an allocator as a parameter for dynamic-memory data types, and you can supply a static one (e.g. a fixed-buffer allocator)
- in the near future it will be possible to link C libraries against a modular libc written in Zig, so you will be able to use C code that uses malloc in a static-allocation-only context.
The second one should be exciting for embedded C developers: with a little work, Zig could be slotted into a C toolchain so that, with otherwise almost no Zig code, you can use a C library (or, more ambitiously, a C++ library) that assumes malloc in an embedded application.
Could have said "use any language that has built-in control over allocation strategies".
Sad that C doesn't. Ada does, Rust does, Zig does, plenty of other languages do, C does not, which is hilariously odd considering how often C has to run in places that don't have a heap.
Yeah, they could have said that. It would have been smarter. They didn't, though.
I occasionally see this where I respond to what someone says, and get a reply along the lines of "they could have said this other thing", like that's relevant. It baffles me every time. They didn't say your smarter thing, they said their silly thing.
I guess projecting your smarter ideas on someone else's comment is one of the better mistakes you can make...