GCC's new fortification level: The gains and costs (2022) (redhat.com)
183 points by fanf2 5 months ago | 105 comments



> the program continued using the same pointer, not the realloc call result, since the old pointer did not change

> In this context, it is a bug in the application

This is a non-intuitive result through the normal lens C pointers are viewed by; these two pointers compared equal, so how could using one differ from using the other? Pointer provenance rears its ugly head here, though: these two "pointers that have the same value" have different provenance. By the standard, they _are_ allowed to be treated differently, and honestly the standard requires it.

https://www.ralfj.de/blog/2020/12/14/provenance.html is pretty accessible, and shockingly on point: "just because two pointers point to the same address, does not mean they are equal in the sense that they can be used interchangeably".
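A minimal sketch of that idea (my own illustration, not taken from the post): whether the two addresses coincide is up to the implementation, but even when they do, only one of the pointers may be used to access the object.

    #include <stdio.h>
    int main(void) {
        int x[1] = {0}, y[1] = {0};
        int *p = x + 1;  /* one-past-the-end of x: valid to form and to compare */
        int *q = y;      /* points to y */
        if (p == q)      /* may be true if y happens to sit right after x */
            printf("same address, different provenance\n");
        /* *p = 1; would be undefined: p's provenance is x, not y */
        return 0;
    }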

EDIT: spelling


Provenance might be used as justification now but the actual rules are simpler and stricter. After freeing (or reallocing) a pointer, the application must not inspect the pointer value anymore. Even `new_pointer == old_pointer` is not allowed.

IIRC, one justification for this was to account for systems with non-flat memory where inspecting the value of the old pointer might cause a processor exception.


Wow, look, move semantics! An advanced compiler could even check for that...


> Even `new_pointer == old_pointer` is not allowed.

This is legal. But dereferencing old_pointer even after this check has passed is undefined.


No, it is not, and it has never been OK for as long as C has been standardized by ANSI or ISO.

C89 says that "The value of a pointer that refers to freed space is indeterminate." and that behavior is undefined "upon use ... of indeterminately-valued objects", hence a compiled program can e.g. behave as if `new_pointer == old_pointer` even though the object was relocated in memory.


Huh, okay, I didn't know that, but apparently it is true.

Using clang, program #1:

    #include <stdio.h>
    #include <stdlib.h>
    int main(int argc, char **argv) {
        void * p = malloc(123);
        void * q = realloc(p, 200);
        printf("%p == %p -> %i\n", p, q, p == q);
        free(q);
        printf("%p\n", q);
        return 0;
    }
prints out:

    0x5f867a8a9300 == 0x5f867a8a9300 -> 1
    0x5f867a8a9300
Program #2: (only difference is the extra printf after malloc)

    #include <stdio.h>
    #include <stdlib.h>
    int main(int argc, char **argv) {
        void * p = malloc(123);
        printf("%p\n", p);
        void * q = realloc(p, 200);
        printf("%p == %p -> %i\n", p, q, p == q);
        free(q);
        printf("%p\n", q);
        return 0;
    }
prints out:

    0x5bcba8225300
    0x5bcba8225300 == 0x5bcba82257a0 -> 0
    0x5bcba82257a0
So if we print out the pointer before reallocation then they're not equal, but if we don't then they are equal.

Funny enough, "-fsanitize=undefined" doesn't seem to detect this. Neither does "-fsanitize=address" (but with ASAN the results are now consistent and in both cases compare to not equal).


If the standard says it is invalid, then they had better make the value of the variable holding the pointer invalid.

But the malloc/free interface is one of the most badly designed interfaces in the C language, it's not even funny.


They have made it invalid. You’re not supposed to do it.


The standard says free takes a void*, i.e. it takes the pointer by value, and passing values by value to a function prevents the callee from changing them.

More literally, your pointer value is copied into a register before the call into libc, so free can't change the value even if it wants to. Realloc can't either for the same reason.

That provenance has come up with the premise that a != a for a 64-bit integer value is an error in the formalism; specifically, the language is deliberately inventing things that cannot be so on hardware.


Note that (a) pointer representations can contain bits representing the provenance that don’t participate in equality comparison, and (b) a “C implementation” in terms of the C standard includes the compiler, which means the compiler is allowed to track pointer provenance statically even when the representation in memory doesn’t differ (e.g. exactly what GCC is doing here). In addition, a C implementation is also allowed to track pointer provenance externally at runtime, again even if the pointer representation (as exposed by memcpy into a char buffer) is the same.


To the extent that your compiler has embraced the provenance model, yes. But that does mean that you can't assume that the language pointer is the one the machine gave you and therefore probably can't write code that does anything with pointers other than pass them around as opaque handles. No testing their alignment, no comparing them to each other.


> specifically the language is deliberately inventing things that cannot be so on hardware

So what? The standard has "undefined behavior"; real implementations always do something, even if that something cannot be determined in advance. The standard is the standard, it's not a machine.


Well, at some point in history C was about telling a physical machine what to do. You can see that in the operations it exposes.

I don't know what modern C is for. The one that manipulates an abstract machine with concurrency oracles and time travelling metadata on object identifiers. It looks like an aberration derived from C++ to me.


Every compiler for C has always done things like register allocation that make it not assembly.


Would it be acceptable to save the old address as a uintptr_t before the realloc(), and compare it with the new one after?


You could do it legally, but the results are not required to be what you want them to be.
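For what it's worth, that approach looks like this (my own sketch; forming the integers is fine, the comparison just isn't guaranteed to tell you anything you can act on):

    #include <stdint.h>
    #include <stdlib.h>
    void *grow(void *p, size_t n) {
        uintptr_t old = (uintptr_t)p;  /* capture the address as an integer before realloc */
        void *q = realloc(p, n);
        if (q != NULL && (uintptr_t)q == old) {
            /* the integers compare equal, but the standard does not promise
               this means the block was resized in place */
        }
        return q;
    }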


This is only ever a problem in hosted C and its standard library malloc and free, right? If you write freestanding C with your own memory allocator you made from scratch, then the compiler will never make such assumptions. Right?

That wheel is certainly in need of some serious reinvention anyway. This seems like a good starting point:

https://nullprogram.com/blog/2023/12/17/


It also makes sense for an allocator that can satisfy new allocations with recently freed allocations.


It doesn't mean that but maybe it actually should. The fact these perfectly good intuitions don't apply because of completely irrelevant reasons is a huge cause of friction and bugs.

Your article contains the perfect example:

> UB on integer overflow is a compiler-only concept

> every target supported by the compiler will do the obvious thing and just produce an overflowing result

Pretty much every computer that matters works the way you expect it to: the value will overflow, some flag will be set, etc. The compiler couldn't care less though. The compiler decrees that it shall be undefined. You know it works but it doesn't because the compiler refuses to do what you expect it to do because it's "undefined".

Well then just define it for god's sake. I'm so tired of the uncertainty. Tired of playing these games with the compiler. Undefined? I don't even want to read that word ever again.

Let's define this right now.

  -fwrapv
There. It is now defined to be what everyone expects and wants. Now the optimizer won't be getting "clever" with this and generating complete garbage code as a result. The compiler's optimizer is no longer your enemy in this case. It won't be erroneously optimizing entire loops to some random constant anymore. One would think they'd emit a warning instead in these cases but no.
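As a concrete example (a sketch of the classic pattern), this kind of post-hoc overflow check is exactly what the optimizer is entitled to delete under the default rules, and what -fwrapv makes behave as written:

    #include <limits.h>
    #include <stdio.h>
    int increment_checked(int x) {
        /* With default semantics the compiler may assume x + 1 never wraps,
           so `x + 1 < x` is "always false" and the branch can be removed.
           With -fwrapv, signed overflow wraps and the check works as written. */
        if (x + 1 < x) {
            fprintf(stderr, "overflow\n");
            return INT_MAX;
        }
        return x + 1;
    }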

Strict aliasing is yet another massive pain that you need to deal with because of compilers and their "optimizations". People have actually told me to "launder" pointers through asm statements so the compiler can't make nonsense assumptions about that stuff.

This is C. We do things like write memory allocators. Of course we're going to alias stuff. I have a structure type, I have a byte buffer with the data, of course I want to overlay the type on the byte buffer and just access it directly. Why is it that the compiler just doesn't let me reinterpret memory however I want? Compiler has no business making my life hell because of this.

Turns out the C standard prescribes some "strict aliasing" nonsense because Fortran has it or some other reason nobody really cares about. If you're doing anything at all in C you're probably violating this.

  -fno-strict-aliasing
There. You can now do what you want to do without the compiler getting all clever about it and ruining your day with nonsense code generation. Now it won't be reordering your code into nonsense just because it can "prove" that two pointers can't be equal even though you literally assigned one pointer to the other.

Over time I've built up this little collection of C compiler flags and they've become unconditional overrides in all my makefiles. No matter what users pass in via CFLAGS or whatever, these little things still get disabled. I have no idea what the performance impact is and honestly I don't care.

There's quite a lot of documented compiler flags, I couldn't evaluate every single one of them. If anyone here knows of any other useful flags that define the undefined, please reply to this comment. I'll add them to my makefiles without thinking twice.


> It won't be erroneously optimizing entire loops to some random constant anymore. One would think they'd emit a warning instead in these cases but no.

It also won't be optimizing other loops that you want it to optimize. Turns out it's a tradeoff, and lots of people choose C and C++ exactly because of this focus on speed.

> This is C. We do things like write memory allocators. Of course we're going to alias stuff. I have a structure type, I have a byte buffer with the data, of course I want to overlay the type on the byte buffer and just access it directly. Why is it that the compiler just doesn't let me reinterpret memory however I want? Compiler has no business making my life hell because of this.

char is allowed to alias other types. Other types of aliasing are rarely needed and you can always memcpy the contents if you really do need to.
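The memcpy route looks like this (a sketch; in my experience compilers turn the small fixed-size copy into a plain register move at -O2, so nothing is actually copied around):

    #include <stdint.h>
    #include <string.h>
    /* Read the bit pattern of a float without violating strict aliasing:
       memcpy between objects of different types is always allowed. */
    static uint32_t float_bits(float f) {
        uint32_t u;
        memcpy(&u, &f, sizeof u);
        return u;
    }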

> Turns out the C standard prescribes some "strict aliasing" nonsense because Fortran has it or some other reason nobody really cares about. If you're doing anything at all in C you're probably violating this.

No, it prescribes it because otherwise it would need to load from memory on every pointer dereference after something completely different has been written to. This is obviously undesirable if you care even one bit about performance.

None of these things are the compiler fighting you. In fact it is trying to help you. But you seem to want something closer to a high-level assembler, and you can have that: don't use an optimizing compiler, or use flags to change the language semantics to your taste. That doesn't make the default semantics bad though.

> I have no idea what the performance impact is and honestly I don't care.

Then C and C++ are simply not made for you. There are plenty of "safe" languages for people who don't care about the performance impact.

> If anyone here knows of any other useful flags that define the undefined, please reply to this comment. I'll add them to my makefiles without thinking twice.

Try -O0.


> It also won't be optimizing other loops that you want it to optimize. Turns out it's a tradeoff and lots of people choose C and C++ exactly because of this focus on speed..

Hmm, could possibly include optimization flags directly into the source code (would be clearer and less compiler mess perhaps). I could also see this causing other issues...


Most compilers have (non-standard) constructs to control optimizations


> It also won't be optimizing other loops that you want it to optimize. Turns out it's a tradeoff and lots of people choose C and C++ exactly because of this focus on speed..

Yeah. That's sad but there's nothing that can be done about it. If you write C you're basically guaranteed to violate strict aliasing at some point and I just cannot handle the uncertainty anymore. The Linux kernel developers are much smarter than me and as far as I know they reached the same conclusion as I did: this aliasing stuff is not even worth enabling.

I know there's a may_alias attribute you can apply to pointers to warn the compiler but I'm not confident in my ability to do it 100% correctly. I have a feeling it'd end up getting applied to a significant chunk of the code anyway, might as well just turn it off.

> char is allowed to alias other types.

I know. But can uint8_t? It's basically the same type as unsigned char but it's still a distinct type in the compiler's eyes. Never found a definitive answer to that question. What if you define your own u8 type?

I know for a fact uint16_t can't safely alias anything. Implementing a 16 bit hash function? Hoping to just treat an arbitrary buffer as an array of u16? Not gonna happen unless you turn off this aliasing nonsense.
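For reference, the attribute mentioned above is attached to a typedef, and accesses through that type are then treated like char accesses for aliasing purposes (a sketch, assuming GCC or Clang):

    #include <stddef.h>
    #include <stdint.h>
    /* A 16-bit type the compiler must assume can alias anything. */
    typedef uint16_t __attribute__((may_alias)) u16a;

    static uint16_t hash16(const void *buf, size_t n_halfwords) {
        const u16a *p = buf;
        uint16_t h = 0;
        for (size_t i = 0; i < n_halfwords; i++)
            h ^= p[i];  /* OK even if buf really points at some other type */
        return h;
    }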

> Other types of aliasing are rarely needed and you can always memcpy the contents if you really do need to.

I shouldn't have to copy huge buffers all over the place just to get the compiler to generate the simple instructions I want. This is on the same level as the "pointer laundering" advice I've received. Oh just pass the pointer through some inline assembly, compiler can't see into it so it assumes anything can happen and it won't screw up the code trying to optimize. This memcpy is just yet another thing the compiler can't see into. The proper solution is to kill the problematic optimizations.

> No, it prescribes it because otherwise it would need to load from memory on every pointer dereference after something completely different has been written to. This is obviously undesirable if you care even one bit about performance.

It doesn't have to be that way. I statically link all my code, enable link time optimization, pass flags like -fwhole-program. Surely the compiler can analyze the entire program and statically determine that certain pointers do not alias.

I have a memory allocator that works with a huge statically allocated byte buffer as its backing memory. Surely the compiler can determine that all pointers returned by my memory allocation functions are subsets of that buffer and that they do not necessarily alias each other.

> use flags to change the language semantics to your taste

That's exactly what I was advocating for in my post.

> That doesn't make the default semantics bad though.

I would argue that they are extremely bad. Compilers will straight up delete overflow and null checks from your code if the conditions are right. This can cause it to "optimize" entire functions into literally nothing. There's just no way anyone can convince me that this is not adversarial, if not malicious.

> Then C and C++ are simply not made for you.

C was made to support the development of Unix. I'm writing similar low level Linux operating system stuff. I'd say I'm closer to the target audience than all these speed obsessed folks ever were.

> There are plenty of "safe" languages for people who don't care about the perfomance impact.

I absolutely do care about performance in general.

I said I don't care about the performance impact of that one flag. Adding that flag buys me the certainty that the compiler won't screw the code up based on absolute nonsense assumptions that I don't ever want it to make. I've made the decision to not care in this case. I absolutely do want all the other optimizations the C compilers will give me.

And I don't want "safe", I want "sane".

Things like signed integer overflow not being C's particularly idiotic "can't ever happen, please delete any checks" flavor of undefined fall under my definition of "sane". I also consider overlaying a structure on top of a byte buffer to be such a basic operation that there's no point in using C at all if the compiler is gonna screw this up.

> Try -O0.

No. That disables too many useful optimizations that are not problematic in any way whatsoever.


> I have no idea what the performance impact is and honestly I don't care.

Then why are you writing C? You could be writing some other, higher-level language instead if performance is not your top priority.

Perhaps you use C because you must interact with other C code; many languages have quite usable C FFI. If you must use C because your code is part of another project (e.g. the Linux kernel), you don't have control over the flags anyway. If you must use C because of low memory availability (e.g. a microcontroller), then perhaps fine — but that's quite niche. (And there's always Rust.)


C was hardly a performance juggernaut in the 1980's, any junior Assembly coder could easily outperform the outcome of the C compilers.

It was by exploiting UB to the extreme, the way compiler optimizers do nowadays, that the language finally got the fame it has 40 years later.

"Oh, it was quite a while ago. I kind of stopped when C came out. That was a big blow. We were making so much good progress on optimizations and transformations. We were getting rid of just one nice problem after another. When C came out, at one of the SIGPLAN compiler conferences, there was a debate between Steve Johnson from Bell Labs, who was supporting C, and one of our people, Bill Harrison, who was working on a project that I had at that time supporting automatic optimization...The nubbin of the debate was Steve's defense of not having to build optimizers anymore because the programmer would take care of it. That it was really a programmer's issue.... Seibel: Do you think C is a reasonable language if they had restricted its use to operating-system kernels? Allen: Oh, yeah. That would have been fine. And, in fact, you need to have something like that, something where experts can really fine-tune without big bottlenecks because those are key problems to solve. By 1960, we had a long list of amazing languages: Lisp, APL, Fortran, COBOL, Algol 60. These are higher-level than C. We have seriously regressed, since C developed. C has destroyed our ability to advance the state of the art in automatic optimization, automatic parallelization, automatic mapping of a high-level language to the machine. This is one of the reasons compilers are ... basically not taught much anymore in the colleges and universities."

-- Fran Allen interview, Excerpted from: Peter Seibel. Coders at Work: Reflections on the Craft of Programming


> C was hardly a performance juggernaut in the 1980's, any junior Assembly coder could easily outperform the outcome of the C compilers.

I think that's not especially relevant. The issue is that most other higher level programming languages had poorer performance and/or inserted undesirable mysteriously-timed lags in performance. Fortran had excellent performance, but Fortran at that time couldn't handle many constructs that C can. Some Common Lisp implementations at the time had reasonable performance, but the garbage collectors often inserted pauses (ugh!), and many were expensive at the time too.

Also: the 1980s saw the rise of the IBM PC compatibles (and the Macs too). These were machines whose underlying chips matched C decently well (unlike the 6502, they had large hardware-supported stacks); however, these machines were underpowered compared to what some people wanted them to do. The decade also saw the rise of Unix servers, where C was the native tongue.


The 21st century renaissance of high level languages got us somewhat back on track.


I write C because I like C. That's the only reason I've got. It's what I feel comfortable writing code in. I tried learning Rust but I just couldn't get used to it.

I like C where I write the functions and the structures and the compiler just translates it to the simple processor instructions you expect it to. No complicated hidden machinery under the hood, just simple unmangled symbols pointing at simple code that conforms to simple ABIs. This simplicity is also the reason the so called "C ABI" is so ubiquitous.

Another reason I write C is it's one of very few languages with freestanding code support. I only write freestanding C targeting the Linux kernel-userspace ABI. In my opinion C is a much nicer language without the libc and its legacy. I really enjoy it.


> Then why are you writing C? You could be writing some other, higher-level language instead if performance is not your top priority.

This goal seems somewhat oxymoronic - if the "goal" is to be keenly aware of what the compiler will do and double check all assembly, then you shouldn't use C. If the goal is to ignore assembly and not pay attention to these sorts of things, then you shouldn't use C.

The only valid niche I see is where the goal is to use C, not really have to worry about the assembly, have it do the "right" thing on your platform. I.e. no undefined behavior, just underperforming (and called out as such). Then you can go in and "approve" various optimizations.


> Then why are you writing C? You could be writing some other, higher-level language instead if performance is not your top priority.

If performance is your top priority C is also the wrong choice.


What would you suggest instead?


If you are writing general purpose software, I would suggest Rust if I have nothing else to go on, although there are plenty of other good choices if you know more about the problem.

In a few cases there is today a special purpose language for the kind of thing you're writing which will knock both C and Rust on their backs for performance because it was never trying to do anything else.


I think C is still a good choice for the following reasons:

- Performance is generally very good, and while special languages can sometimes be better, it is usually easily possible to get the C code to the same level.

- But with special languages you will often run into limitations where you are stuck.

- It has long-term stability.

- Compile times are short, which I really like.

- There are good tools.

- It has very good interoperability with everything.

- The code can easily be reused from most other languages, so it does not tie you into a specific framework.

- etc.


It’s been a long time since I worked with C, but my recollection was that a) strict aliasing allows for optimizations that are actually worthwhile, and b) it’s really easy to type pun in a defined way using unions anyway.


Type punning with unions was actually forbidden by C89. You were only ever supposed to read the union member which was last written to. This may have been relaxed in C17; I can only find a draft online, but it allows for type punning in unions as long as the member being read is not longer in size than the member last written to.


What the standard says doesn't really matter. Only what major compilers do matters. GCC has decreed that type punning through unions is supported, therefore it might as well be standard.


IIRC it was supported in C89 and described as implementation-defined and C99 changed the wording and mentions union type punning.
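For completeness, the union form looks like this (a sketch; GCC's manual explicitly documents reading a different member than the one last written as supported, provided the access goes through the union type):

    #include <stdint.h>
    /* Write one union member, read another: type punning through the union. */
    static uint32_t float_bits_union(float f) {
        union { float f; uint32_t u; } pun;
        pun.f = f;
        return pun.u;
    }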


The optimizations can easily be gained back by "manual" loading and storing to temporary local variables.

The classical:

  int foo(float *f, int *x) {
    *x = 2;
    *f = 3.0f;
    return *x; // oh no, without type-based aliasing I have to load x again!
  }
Can obviously be rewritten to:

  int foo(float *f, int *x) {
    int z = 2;
    *x = z;
    *f = 3.0f;
    return z; // ah, thank you programmer, z has not had its address taken, it's obviously 2.
  }


That code does not violate the aliasing rules in any case.

The two functions you wrote are not the same; the first re-reads *x which may return a different value than 2 if *x was modified in between the first and third lines of the function by another thread (or hardware). However, since x is not marked volatile, the compiler will usually optimize the first function to behave the same as the second.


> strict aliasing allows for optimizations that are actually worthwhile

I don't think there are many sensible, real world examples.

A nice explanation of the optimizations the strict-aliasing rule allows: https://stackoverflow.com/a/99010/66088

The example given is:

    #include <stdint.h>
    #include <stdlib.h>
    typedef struct Msg {
        unsigned int a;
        unsigned int b;
    } Msg;

    void SendWord(uint32_t);

    int main(void) {
        // Get a 32-bit buffer from the system
        uint32_t* buff = malloc(sizeof(Msg));

        // Alias that buffer through message
        Msg* msg = (Msg*)(buff);

        // Send a bunch of messages
        for (int i = 0; i < 10; ++i) {
            msg->a = i;
            msg->b = i+1;
            SendWord(buff[0]);
            SendWord(buff[1]);
        }
    }
The explanation is: with strict aliasing the compiler doesn't have to think about inserting instructions to reload the contents of buff every iteration of the loop.

The problem I have is that when we re-write the example to use a union, the generated code is the same regardless of whether we pass -fno-strict-aliasing or not. So this isn't a working example of an optimization enabled by strict aliasing. It makes no difference whether I build it with clang or gcc, for x86-64 or arm7. I don't think I did it wrong. We still have a memory load instruction in the loop. See https://godbolt.org/z/9xzq87d1r

Knowing whether a C compiler will make an optimization or not is all but impossible. The simplest and most reliable solution in this case is to do the loop hoisting optimization manually:

        uint32_t buff0 = buff[0];
        uint32_t buff1 = buff[1];
        for (int i = 0; i < 10; ++i) {
            msg->a = i;
            msg->b = i+1;
            SendWord(buff0);
            SendWord(buff1);
        }
Doing so removes the load instruction from the loop. See https://godbolt.org/z/ecGrvb3se

Note 1: The first thing that goes wrong for the Stack Overflow example is that the compiler spots that malloc returns uninitialized data, so it can omit the reloading of buff in the loop anyway. In fact it removes the malloc too. Here's clang 18 doing that https://godbolt.org/z/97a8K73ss. I had to replace malloc with an undefined GetBuff() function, so the compiler couldn't assume the returned data was uninitialized.

Note 2: Once we're calling GetBuff() instead of malloc(), the compiler has to assume that SendWord(buff[0]) could change buff, and therefore it has to reload it in the loop even with strict-aliasing enabled.


The strict aliasing stuff allows you to do "optimisations" across translation units that are otherwise unsound.

The compiler's alias analysis within a translation unit is much more effective than those rules permit, because it matters whether one int* aliases another int*.

And then we have link time optimisation, at which point the much better alias analysis runs across the whole program.

What remains therefore is a language semantically compromised to help primitive compilers that no longer exist to emit slightly better code.

This is a deeply annoying state of affairs.


Aliasing analysis is quite helpful for sophisticated compilers to generate good code.


Alias analysis is critical. Knowing what loads and stores can alias one another is a prerequisite for reordering them, hoisting operations out of loops and so forth. Therefore the compiler needs to do that work - but it needs to do it on values that are the same type as each other, not only on types that happen to differ.

Knowing that different types don't alias is a fast path in the analysis or a crutch for a lack of link time optimisation. The price is being unable to write code that does things like initialise an array using normal stores and then operates on it with atomic operations, implement some floating point operations, access network packets as structs, mmap hashtables from disk into C structs and so forth. An especially irritating one is the hostility to arrays that are sometimes a sequence of simd types and sometimes a sequence of uint64_ts.

Though C++ is slowly accumulating enough escape hatches to work around that (launder et al), C is distinctly lacking in the same.


Alias analysis is important. It's the C standard's type-based "strict aliasing" rules which are nonsense and should be disabled by default.

This is C. Here in these lands, we do things like cast float* to int* so that we can do evil bit level manipulation. The compiler is just gonna have to put that in its pipeline and compile it.


How does the version with buff0 and buff1 work? It looks like it always sends the same two values...


Hmmm, yes. I didn't understand what the code did.

Instead of creating those buff0 and buff1 variables before the loop, I should have done:

    for (int i = 0; i < 10; ++i) {
        unsigned a = i;
        unsigned b = i+1;
        msg->a = a;
        msg->b = b;
        SendWord(a);
        SendWord(b);   
    }
That gets rid of the load from the loop. https://godbolt.org/z/xsqWfxKzd


>> Pretty much every computer that matters works the way you expect it to: the value will overflow, some flag will be set, etc. The compiler couldn't care less though.

And RISC-V drops the hardware flags entirely. I questioned this when I read it, but "carry" is really only useful in asm code for adding and subtracting integers larger than register size, which is not much of a thing these days. None of the flags are accessible from C either so they just dropped them and things seem to be going well!


> but "carry" is really only useful in asm code for adding and subtracting integers larger than register size, which is not much of a thing these days

Funny you mention that. I'm studying math and algorithms to implement the arbitrary precision integer support for my programming language. The word "carry" is showing up a lot in my research so that bit does seem to be an incredibly useful feature and I have no idea why they'd remove it.

> None of the flags are accessible from C either so they just dropped them and things seem to be going well!

I know you can't directly access those flags but the fact is compilers do provide their functionality to programmers via builtins.

For example:

https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins...

Those builtins will cause the compiler to generate optimal instructions such as conditional jumps that depend on the carry or overflow bits.

I used the __builtin_mul_overflow function to check for overflow in my memory allocation code.

This is what we really want. We want to handle these conditions, no matter how they're implemented. This is a perfectly acceptable way for the compiler to expose these processor features in a way that makes sense to us programmers. In the case of RISC-V, GCC will just emit whatever code is appropriate for checking overflows on that processor.
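That allocator check looks roughly like this (a sketch of the pattern, not the actual code being described):

    #include <stddef.h>
    #include <stdlib.h>
    /* Allocate `count` elements of `size` bytes each, failing cleanly
       if the multiplication would overflow size_t. */
    void *alloc_array(size_t count, size_t size) {
        size_t total;
        if (__builtin_mul_overflow(count, size, &total))
            return NULL;  /* overflow reported instead of wrapping silently */
        return malloc(total);
    }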

What's definitely not acceptable is for the compiler to just go "aw man look at this USELESS overflow checking code over here, AS WE ALL KNOW signed integer overflow is UNDEFINED and therefore CANNOT EVER HAPPEN, why the hell is this shit for brains programmer checking for a thing that can't happen, I'll just go ahead delete this nonsense code literally right now, oh wow would you look at that, looks like I just optimized the entire program into a noop, I must be the best compiler in the universe".


> Pretty much every computer that matters works the way you expect it to

Except the ones that saturate, that produce trap values, etc.


The `realloc` problem isn't the only one - it also breaks many formerly-well-defined programs that use `malloc_usable_size`.

Instead of fixing this, the developers behind the "dynamic object size" push have been changing the documentation to declare any use of `malloc_usable_size` buggy even though it used to be explicitly documented as safe.

I suspect that GCC's optimization passes will break even C-standard-compliant use of `realloc`, similar to how ASAN can break due to dynamic equality checks between pointers of different provenance.

Life would be much simpler for many of us if the standards committee bothered to standardize a function that says "give me a buffer of semi-arbitrary size and tell me how big it is; I promise to resize it later", which is very widely wanted. An explicit "realloc in place only, else fail" would also make many more useful programs feasible to write.
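Hypothetical signatures for those two wishes might look something like this (purely illustrative; neither function exists in any standard library):

    #include <stddef.h>
    /* "Give me at least min_size bytes and tell me how many I actually got." */
    void *alloc_at_least(size_t min_size, size_t *actual_size);

    /* "Grow or shrink this block in place only; never move it; fail with NULL." */
    void *realloc_in_place(void *ptr, size_t new_size);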


> it also breaks many formerly-well-defined programs that use `malloc_usable_size`

Does it? According to the documentation for malloc_usable_size, it can be used to find out the actual size of an allocation, but you still have to call realloc before writing to bytes beyond the size passed to malloc.

"This function is intended to only be used for diagnostics and statistics; writing to the excess memory without first calling realloc(3) to resize the allocation is not supported."


Yes, that's what the documentation has been retroactively changed to say. It used to say that the memory was usable immediately.

Also, `realloc` is not guaranteed to actually operate in-place.


> give me a buffer of semi-arbitrary size and tell me how big it is; I promise to resize it later

I'm not sure how you could use this usefully. If you don't care what size you get, why would you allocate in the first place? And if you do have a minimum size you need right now, and a need for it to be bigger later, isn't that what a malloc/realloc dance is for?


Cases where the exact buffer size doesn't matter are ubiquitous, for example:

* read a file(-like) in streaming mode. Whenever the buffer is empty, fill it. The actual allocated size does not matter at all for most kinds of file.

* push objects onto a vector when the capacity is used up, reallocate at a larger size and keep pushing. The actual allocated capacity doesn't matter at any point.

* implement a bloom filter for approximate set membership. If the allocator happens to give you a little more than your mathematical estimation for the chances you want, you might as well use it.

In fact, I dare say: every allocation size that is neither `0` nor `sizeof(T)` doesn't fundamentally care about the size (a given implementation may care, but if the standard bothered to implement useful new functionality we would change the implementations).

This is unlike, say, my desire for skewed-alignment allocators, which is not particularly useful for most programs.


> In fact, I dare say: every allocation size that is neither `0` nor `sizeof(T)` doesn't fundamentally care about the size

Not true? When you copy an array (common case: a string) that is passed to you, you want the copy to be the same size. No more (that wastes memory) and no less (then you don't have a copy), just as with sizeof(T).


I hadn't thought about the `sizeof(T)` point before, and I can think of a handful of exceptions, but it's a great and expressive rule of thumb.


Example: reading a file line-by-line. You don’t, can’t, know how big a line is. Your best option is to allocate some random chunk of memory, like 4 kilobytes, read 4 kilobytes from the file into memory, and see if you happened across a newline in there. If you did, shuffle everything around a little / realloc / ringbuffer shenanigans; do your favourite. If you didn’t, make the buffer bigger (by 1.5x? 2x? Log2x?), and try again.

This dance is super common with variable length protocols over unframed streams - ie, most things over tcp. So this is an exceptionally common pattern in IO operations.

Other common times this pattern happens: finding all items in a list that satisfy a predicate; tracking items a user has added to an interface; cycle detection in graphs; …
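A condensed sketch of that read-and-grow dance, using a plain doubling strategy (error handling trimmed to the essentials):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    /* Read from fp into a growing buffer until a newline shows up. */
    char *read_line(FILE *fp) {
        size_t cap = 4096, len = 0;
        char *buf = malloc(cap);
        if (!buf) return NULL;
        while (fgets(buf + len, cap - len, fp)) {
            len += strlen(buf + len);
            if (len > 0 && buf[len - 1] == '\n')
                return buf;           /* found the newline */
            cap *= 2;                 /* no newline yet: grow and keep reading */
            char *bigger = realloc(buf, cap);
            if (!bigger) { free(buf); return NULL; }
            buf = bigger;
        }
        if (len > 0) return buf;      /* EOF with a final unterminated line */
        free(buf);
        return NULL;
    }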


I don't think this or the sibling would be improved with a function that gave you an arbitrarily sized allocation where you also need to query the size of that allocation though? In all the cases you need to know what the size of the buffer is, even though you don't care whether it's 1kb, 4kb, or 8kb (although I imagine you'd care if you got 16b or 16gb)


It's not about querying the size, it's about what the allocator has available without having to ask the operating system for new pages.

It's got a contiguous block of 6371 bytes? Cool I'll take that, the specific size didn't matter that much.


E.g. "Give me something in the ballpark of 4 KiB; more is OK, but tell me how much".

It would allow the allocator to return a chunk it has handy, instead of cutting it to size and fragmenting things, only to be asked to make the chunk larger a few microseconds later.

It will save both on housekeeping and fragmentation handling if the caller knows that the chunk will likely need to grow.


C++23 added `allocate_at_least`: https://en.cppreference.com/w/cpp/memory/allocator_traits/al...

I'm not sure if any standard libraries have an implementation that takes advantage of the "at least" yet.


It's a performance optimization for growing data structures. If you can use all the space that was actually allocated, you can (on average) call realloc less often when the container is growing dynamically.


Life would be much simpler for many of us if people would stop complaining on internet forums and start contributing to open-source or standardization efforts.


Over many years I've made several attempts to contribute to GNU projects and I don't think I've ever succeeded. At some point I started to wonder if I just suck at all this. I don't seem to encounter any problems when I interact with any other project though so that can't be it.

And I don't mean simple reports either, I mean I've sent patches which ranged from bug fixes to minor and major features. Most recent example: sent a patch that added a separate path variable to bash specifically for sourcing bash scripts, thereby creating a simple library/module system. At some point people called my idea "schizophrenic" and I just left.

I was developing some GCC patches to add builtins that emit Linux system call code. This is just something I'm personally interested in. Lost the work when my hard drive crashed and I'm unsure whether to restart that little project. The people on the mailing list didn't seem to agree with it very much when I tried to justify it.

Honestly the idea that I might spend all this time and effort figuring out and hacking on the absolutely gigantic GCC codebase, only to end up with zero results to show for it, makes complaining on internet forums a very attractive alternative. Who knows, maybe someone who's already involved will read the comments and be convinced. Someone like you.


Fair enough. It is just that I got involved for similar reasons. I wanted certain things to work and nobody listened to me or fixed the bugs I filed. Now that I know first hand what a tremendous amount of work it is to change anything, I am not so excited about comments such as "the standards committee should simply". It is never that simple, for a variety of reasons. It would also be very important to file bugs against compilers, and this is a simple way to contribute, even if it can be frustrating because often nothing happens for a long time.


I see what you mean. Yeah, it's extremely difficult. I guess I learned that the hard way. I thought everything would proceed smoothly if I just showed up with the code already working. All they had to do was review it and apply it if they found no issues, right? Not so.

I've also filed a few bugs and feature requests on the LLVM issue tracker and GNU mailing lists. LLVM has like a trillion open issues, it's like shouting into the void. GNU almost always tells me the existing stuff should be enough and that I don't really need whatever it is I'm asking for.

For example, I requested a way to make the linker add extra PT_NULL segments to the ELF output file so that I could find and patch those segments later. LLVM has yet to respond to the issue I created, and GNU says the arcane linker scripting is enough even though the example they gave me didn't work. Only person who responded favorably to me was the maintainer of the mold linker, he added a couple lines of code and suddenly I could easily patch ELF segments. As a result, I fully integrated the mold linker into my makefile and switched to it.

My point is this:

> I am not so excited about comments such as "the standards committee should simply"

I can relate with that feeling. It's just that we're also not so excited by comments along the lines of "pull requests welcome". That's a direct challenge to rise up and directly participate. When we do, it often happens that we are not even given the time of day. To say it's frustrating would be an understatement.

I'm not going to enumerate every negative experience I've had, better to leave it all in the past. However, I've concluded that many times effort does not equal reward, and that it might just be easier to ask the insiders to change things instead and leave things be if they don't seem convinced. Change is difficult enough for the insiders to make, for an outsider it's orders of magnitude more difficult.


This is certainly true. I am also not in a position to see all my contributions accepted. But it is also unclear how to improve the situation. Big open-source projects certainly are dominated by commercial interests. I found GCC, compared to others, relatively welcoming to outsiders - there are still quite some people with the original hacker spirit involved. To some degree one has to accept that in a community, one cannot always get what one wants. But I certainly agree that it should be easier to get involved and be included.

In my opinion there are two major problems. First, the overall community should value openness more and not simply prefer projects which may have some technical advantage but are then often controlled by only a couple of commercial entities. If we do not value openness, we do not get it. Second, I think complexity is a huge problem. We have to decompose and split our software into smaller parts, where each can more easily be replaced. This would take away power from the people controlling huge frameworks such as LLVM, and I think this would be a good thing.

And finally, people need to be braver. You are not getting heard when you give up too easily. And if you do not get your patch in, maybe create your own project, or maintain your fork, etc.


> But it is also unclear how to improve the situation.

Then please allow me to make a few suggestions based on my limited experience.

I'll begin by saying it's not really my intention to just show up out of nowhere and demand things. When everything is done, I'll go away and the maintainers will remain. They'll continue being responsible for the project while I get to not think about it anymore. It would be disrespectful to demand that they maintain code I wrote.

The key point I want to make is: it's profoundly demotivating when maintainers don't even engage with the contributors. People spend time and effort learning a project, making the change and sending in the patches in good faith. We ask only for their genuine consideration.

If the patch has issues, it's more than enough to reply with a short review detailing what needs to change in order for the patch to be considered. The most likely result of such a review is a v2 patch set being sent with those exact changes implemented. That is the nature of peer review.

What usually happens in my experience is the patches get straight up ignored and forgotten about for an untold amount of time. Then the maintainer suddenly shows up and implements the thing himself! That's nice, in the end I got the feature I wanted, I guess. I just didn't get to become a contributor. I leave wondering why I even bothered to do all that work.

That treatment makes me feel like I'm beneath them, beneath their consideration. I spent days discussing a patch on a mailing list. One person had numerous objections to the idea I was proposing, I tried to address them all but then it turned out he hadn't even read the patch I sent. That seriously almost made me quit on the spot.

The word peer in peer review is extremely important. Reading the work, considering it and offering genuine thoughts, this is how people treat their peers. That is true respect, even when nothing but criticism is offered. Someone who refuses to even read our code doesn't really see us as equals. We're not fellow programmers with ideas worth considering, we're schizophrenics posting crazy talk and submitting nonsense code.

Someone contributed a bug fix to one of my projects about 6 months ago. As soon as I saw the pull request, I acknowledged it and reviewed it. I requested some simple changes, he made those changes and then I accepted it. I made it a point to engage with him so that he could get the commits into the repository and be fully credited for the contribution in the git history. His eyeballs rendered one bug shallow, I felt like that was the least I could do. I think this is a good way to improve the situation. It's certainly the way that I would like to be treated.

It's also important to be honest with oneself and the scope of the project. Sometimes people want perfectly reasonable things but the maintainer has no plans to implement them because of limited time or because they feel it's out of scope for the project. I think it's important to be polite but firm in those cases. If there's no chance that a feature will be accepted, maintainers need to make the decision as early as possible and communicate it clearly. That way people won't waste time and effort implementing a feature that will never be accepted.

> And finally, people need to be braver. You are not getting heard when you give up too easily. And if do not get your patch in, maybe create your own project, or maintain your fork, etc.

Absolutely agree. Especially the part where people create their own projects. My website is powered by a fork of an unmaintained templating engine. I actually want to rewrite all of it some day. Hopefully in my own programming language which I'm also working on as often as time allows.


Is "don't go out and actively break programs you used to promise would work" such a great ask?


What exactly are you talking about? What was actively broken?


`__builtin_dynamic_object_size` has been buggy since it was first implemented, breaking programs that relied on the previously-documented behavior of `malloc_usable_size`.

Instead of fixing the bug, the developers decided to remove the documentation and replace it with something else, breaking backwards compatibility.

Now, it's not that we can't ever break backwards compatibility - but it needs to be done with great deliberation and a transition period, and an alternative needs to be provided. I gave an example of an alternative.


This seems not entirely accurate. "malloc_usable_size" was only recommended for use for statistics before the change in the man page: "Although the excess bytes can be overwritten by the application without ill effects, this is not good programming practice:" and "The main use of this function is for debugging and introspection". The new version makes it clearer. You can find the change here: https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/...

Also no code is "actively broken" by a documentation change. If it happened to work before it still works today. Also I do not see how __builtin_dynamic_object_size is broken. It works as intended and can be used with conforming code just fine. It is simply not compatible with some questionable use of "malloc_usable_size".


It's not the documentation change that's the problem. It's the fact that they changed implementation-defined behavior to undefined behavior in the first place (then changed the documentation to follow). Or equivalently, they changed "recommended" to "required".

In particular, the "without ill effects" is no longer true. It's possible to #ifdef to detect broken libc/compiler combinations, but I'm not confident that avoiding explicit use of __builtin_dynamic_object_size will prevent optimizations from taking advantage of false assumptions based on __attribute__((malloc)).


For a comprehensive set of recommended compiler options for hardening C and C++ programs, see this OpenSSF guide: https://best.openssf.org/Compiler-Hardening-Guides/Compiler-...


I wish these compiler protocols and interfaces were better documented. They just assume you're using glibc for support. I want to fully integrate them with my freestanding nolibc C projects but I can't figure out how to do it.

The compiler's stack canaries were simple enough. The only issue was the ugly symbols. I requested that a feature be added to override the symbol generated by the compiler so I could use good names instead.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113694

Things like instrumentation and sanitizers though? No idea. Even asked about it on Stack Overflow and got no answers to this day.

https://stackoverflow.com/q/77688456

I assume these object sizing builtins make use of function attributes such as malloc, alloc_align and alloc_size. I've added all of those attributes to my memory allocator but I'm not quite sure if they're doing anything useful.
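For reference, wiring those attributes onto an allocator declaration looks roughly like this (a sketch; the attribute spellings are GCC/Clang extensions, and alloc_size is the piece the object-size builtins key off):

    #include <stddef.h>
    /* The result aliases nothing else (malloc), its size is given by
       argument 1 (alloc_size) and its alignment by argument 2 (alloc_align). */
    void *my_alloc(size_t size, size_t align)
        __attribute__((malloc, alloc_size(1), alloc_align(2)));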


They will, but glibc also has additional macro wrappers around certain library functions that do explicit checking based on __builtin_object_size or __builtin_dynamic_object_size. The code is public. You could also ask on the gcc mailing list.


I would imagine false positives could be a huge problem. The behavior if a violation is detected is to gracefully terminate the program, so you could end up with more (but less exploitable) crashes than without FORTIFY_SOURCE.


Just to clarify: false positives in the sense that memory safety is violated intentionally as described in the article with relation to realloc and similar hacks, which would normally not cause problems.


If you use a pointer that’s been realloc’d, then I would not be surprised at all if gcc simply deletes the call to realloc.


I would be surprised in the general case. Realloc does have side effects and the size parameter can be known at runtime only, etc.


Unfortunately _FORTIFY_SOURCE=2 is documented as making some conforming programs fail, and it seems hard to find precise information on this subject, so I'd be hesitant to use it, as well as _FORTIFY_SOURCE=3.


FWIW Debian has been using _FORTIFY_SOURCE=2 for building its packages for a long time now, so there should be plenty of experience with the option.


Title should be: GCC's new (2022) fortification level... ;)


Article is from September 17, 2022, back then it was new. The HN title should -as HN guidelines state- include the year (2022) if it is not equal to the current one.


> The HN title should -as HN guidelines state- include the year

Where do they state that?


(2022)


It also doesn’t deliver what it promises, as it doesn’t discuss the cost well. The only thing it states is that it increases code size, but it doesn’t give numbers, and the header “The gains of improved security coverage outweigh the cost” doesn’t describe its content, which says:

“We need a proper study of performance and code size to understand the magnitude of the impact created by _FORTIFY_SOURCE=3 additional runtime code generation. However the performance and code size overhead may well be worth it due to the magnitude of improvement in security coverage.”


Yes, came here to say this too. Of course it varies, but some measurements would be nice!


Discussed back then:

https://news.ycombinator.com/item?id=32888516 (40 comments)


Just curious, does anyone run with _FORTIFY_SOURCE=3 in production? Did you catch any overflows because of it, and most importantly, is there a noticeable performance degradation?



Thanks, that’s encouraging. I think I will give it a try in my own projects.


This brings up a question. I think we can all agree that detecting an overrun shows a fault in the system. But does it create an error?

Could this be changed so that the overflow does not cause an abort; rather, the next read of that location without a corresponding legal write causes the abort? A buffer overrun does not mean the answer is wrong, but the use of memory that was overrun will.

In that case, production or not, you would want an abort. The answer is wrong!

(Perhaps this makes no sense. If so, sorry, the idea just came to me after reading the article.)


Overrunning the buffer means your program's behaviour is undefined. Continuing to function at all, much less doing what you wanted, merely means you got lucky with the compiler this time.


I tend to argue that continuing to function after a memory overrun is _unlucky_, because it's better to have it fail loudly so you know to fix it, instead of potentially not noticing subtly incorrect behavior.


Yes, a segfault with a core dump at the point the bug happens is always the optimal scenario.


It's also much easier to fix when you know where the overrun occurs compared to getting an abort on a random access later.


I would encourage the community to find a unit of measurement for security. Every defense can be breached, so every defense has an active region, much like transistors. Question is how to quantify.


Only in movies can every defense be breached. It is in fact possible to write code that can’t be hacked.


If the hardware has no bugs...


> C programs routinely suffer from memory management problems.

Again? There is a typo there. - programs + programmers


Oddly the example doesn't work on my GCC 13.3.1, regression or just bad copy paste?


Same here for clang 15 :(


Okay ... clang doesn't have it implemented, sorry.


article date 2022

the new feature appeared in 2021



