Without having looked at OpenBSD's implementation, just unmapping free pages does not mean you catch all writes to free()d memory. If a block is smaller than the page size, which is often the case, its page will likely still hold other live objects, so it won't be unmapped and a stray write won't trigger a segfault.
Valgrind is my tool of choice to find invalid reads or writes. It catches all of them plus it gives you nice information like a stacktrace.
What OpenBSD's implementation of malloc() does get you though is improved coping with memory fragmentation.
Allocating memory in a contiguous heap brings lots of pain - jemalloc (which is the default allocator in FreeBSD and NetBSD) solves this nicely by also using only mmap() instead of sbrk() on Linux.
Note: I should have said "resource reclamation" instead of "fragmentation". See discussion with ajross below for details.
Or simply if the address in question has been remapped since the original free. For big 32 bit processes that's very likely. There's nothing wrong with the implementation really, it's just oversold.
Obviously the valgrind advice is great. But I don't follow your point about mmap vs. sbrk regarding fragmentation. Fragmentation is a property of memory addresses, not the syscall used to allocate them. A big linear heap allocated with sbrk will fragment just as badly as the same heap allocated in a single block, even if it's allowed to have "holes" in it.
Maybe you're claiming that jemalloc deliberately leaves gaps between allocations? That can help a little in practice, though the real effect is just to change the size of the fragmented blocks.
Actually also with a 64 bit address space, you can easily get the same address if the allocator returns a cached block.
The problem with sbrk() is that if you allocate a few megabytes of memory + a tiny chunk, even after freeing everything apart from the tiny chunk near the "program break", nothing will be given back to the OS.
If you use mmap() instead though, all pages can be given back apart from the one where the tiny chunk resides in. This makes for a tremendous difference sometimes.
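To make the difference concrete, here is a minimal sketch, assuming a POSIX system with anonymous mappings (MAP_ANON). The helper names region_alloc and region_free are made up for illustration and are not taken from any real allocator. Each region is its own mapping, so freeing the big one hands its pages back to the kernel no matter where the tiny one lives.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANON on glibc in strict modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: get `len` bytes of zeroed memory straight from
 * the kernel as a private anonymous mapping. Returns NULL on failure. */
static void *region_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: hand the region back to the kernel.
 * Returns 0 on success, -1 on error (as munmap() does). */
static int region_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

With sbrk() the equivalent of region_free() only helps when the freed range sits at the very top of the heap, which is exactly the "tiny chunk near the program break" problem described above.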
OK, I understand. The symptom you're describing isn't really "fragmentation". Fragmentation is the inability to use smaller blocks of memory because the larger allocations won't fit. That behavior isn't changed by this.
You're talking about a resource reclamation issue. Unmapping a page is a clear signal to the kernel that the memory is unused and can be repurposed immediately. Otherwise, it needs to find and eject a page from memory using the VM system, which is more expensive (though I'd guess not a lot more expensive except in pathologically allocation-bound systems).
You are right, it's not fragmentation but resource reclamation. I've added a note to the initial post.
I'm not sure though that the VM system can reclaim memory in the heap allocated with sbrk(). At least I've never seen that before. Or do you mean "eject" as in swap out?
The mmap() implementation is awesome. I used to write code for a network hardware management computer which ran OpenBSD on a memory-constrained Soekris (quasi-embedded environment more or less). The fact that OpenBSD's implementation returns memory when allocations are free()'d saved my bacon more than once. With another malloc I'd be out of memory way sooner.
It can, but this system was running on a Compact Flash (real small, I think maybe 32M usable space) so swapping was out of the question. This was also before CF got good and started mapping in good sectors when bad got overwritten.
"Valgrind is my tool of choice to find invalid reads or writes. It catches all of them plus it gives you nice information like a stacktrace."
Be careful! It's easy to depend on -Wall and Valgrind and still overlook subtle legal but "doh!" errors in array or pointer logic. (Humbly speaking from experience...)
the windows heap manager can be configured to do this on a per-application basis using the 'gflags' utility.
it's a useful debugging tool. you're not assured that use-after-free is going to be tightly temporally coupled with the free (a lot of the time it is though) and as more time passes after free() the odds increase that the dangling virtual address is re-allocated.
it is also the enemy of performance. you should probably not run production things with the heap configured in this way.
I have a number of custom allocators that use the same techniques, with some improvements.
When space is cheap:
All allocations are two 4k pages
The returned pointer is aligned with the end of the buffer to detect overruns (with 16-byte alignment).
The page following the alloc is always denied.
The space around the allocation is filled with flag values, and these are checked on free.
After free the pages are held in storage for a few thousand following allocations.
Variants of this allocator do things like applying this only to specific ranges of allocation sizes, or only after a certain number of allocations.
With this a good number of overruns and use after free bugs have been found.
I mostly used this technique on Windows with Delphi; on Linux I prefer valgrind.
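A rough sketch of the core of such a scheme, assuming POSIX mmap()/mprotect() and following the reading where two data pages are followed by one denied page. The names guard_alloc and guard_free are hypothetical, and the flag-value fill and the delayed reuse after free are left out to keep it short.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANON on glibc in strict modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define GUARD_ALIGN 16u   /* alignment of the returned pointer */
#define DATA_PAGES  2u    /* usable pages in front of the denied page */

/* Round size up to the next multiple of GUARD_ALIGN. */
static size_t guard_round(size_t size)
{
    return (size + GUARD_ALIGN - 1) & ~(size_t)(GUARD_ALIGN - 1);
}

/* Hypothetical allocator: the returned block ends at most 15 bytes
 * before an inaccessible page, so a larger overrun faults at once. */
static void *guard_alloc(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data = DATA_PAGES * page;
    unsigned char *base;

    if (size == 0 || size > data)
        return NULL;

    base = mmap(NULL, data + page, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANON, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Deny all access to the page that follows the data pages. */
    if (mprotect(base + data, page, PROT_NONE) != 0) {
        munmap(base, data + page);
        return NULL;
    }
    return base + data - guard_round(size);
}

/* Hypothetical release: needs the original size to find the mapping
 * start again. Unmaps immediately instead of holding pages back. */
static int guard_free(void *p, size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data = DATA_PAGES * page;
    unsigned char *base = (unsigned char *)p + guard_round(size) - data;

    return munmap(base, data + page);
}
```

Because of the 16-byte alignment, an overrun of a few bytes on an odd-sized block can still land in the padding in front of the denied page, which is presumably where the flag values checked on free come in.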
Ah. It sounded like you were mapping 2 4k pages for the allocation itself, aligning the allocation at the end of that 2-page span, and then mapping the following page and marking it as denied.
I wasn't the original poster, that was just my impression of the scheme. I could have misunderstood, in which case I don't have any alternate theories for what that second page is for :-)
Is this really different from the glibc malloc() implementation with MALLOC_CHECK_ set to 3?
> MALLOC_CHECK_ is designed to be tolerant against simple errors, such as double calls of free() with the same argument, or overruns of a single byte (off-by-one bugs). Not all such errors can be protected against, however, and memory leaks can result.
> If MALLOC_CHECK_ is set to 0, any detected heap corruption is silently ignored;
> if set to 1, a diagnostic message is printed on stderr;
> if set to 2, abort(3) is called immediately;
> if set to 3, a diagnostic message is printed on stderr and the program is aborted.
Yes, the OpenBSD technique of completely unmapping the memory rather than re-using it will be able to catch even some read accesses, not just write accesses that happen to corrupt malloc's guards.
You could easily do this on Linux, where the executable format allows you to override library-defined symbols, including malloc(). Implement the new malloc/calloc/realloc/free as a static library and link it to your dev builds. Don't link it for releases.
You seem to miss the difference between "developer testing his/her app" and "everyone testing every app".
To the point, I'm using "/etc/malloc.conf -> AFGJPRX" on most of my systems, which means that virtually everything that runs there is checked. Can you easily do that on Linux? How many apps are you running via valgrind on a daily basis?
In addition to adding nothing to this thread your statement is incorrect. The oxford dictionary explicitly defines a figurative definition of leverage.
If this usage bothers you I must caution you against working in a finance related field. One or two days would expose you to countless instances of using leverage in a figurative sense.