I think the biggest problem is people conflating "undefined" with "unknowable". ...

tsimionescu · 2024-08-04T07:27:46 1722756466

This is dead wrong, and a very dangerous mindset.

All modern C and C++ compilers use the potential of UB as a signal in their optimization options. It is 100% unpredictable how a given piece of code where UB happens will actually be compiled, unless you are intimately familiar with every detail of the optimizer and the signals it uses. And even if you are, seemingly unrelated changes can change the logic of the optimizer just enough to entirely change the compilation of your UB segment (e.g. because a function is now too long to be inlined, so a certain piece of code can no longer be guaranteed to have some property, so [...]).

Your example of signed integer overflow is particularly glaring, as this has actually triggered real bugs in the Linux kernel (before they started using a compilation flag to force signed integer overflow to be considered defined behavior). Sure, the compiler compiles all signed integer operations to processor instructions that result in two's complement operations, and thus overflow on addition. But, since signed integer overflow is UB, the compiler also assumes it never happens, and optimizes your program accordingly.

For example, the following program will never print "overflow" regardless of what value x has:

  int foo(int num) {
    int x = num + 100;
    if (x < num) {
      printf("overflow occured");
    }
    return x;
  }

In fact, you won't even find the string "overflow" in the compiled binary, as the whole `if` is optimized away [0], since per the standard signed integer overflow can't occur, so x + 100 is always greater than x for any (valid) value of x.

[0] https://godbolt.org/z/zzdr4q1Gx

deaddodo · 2024-08-04T23:07:41 1722812861

> This is dead wrong, and a very dangerous mindset.

It's "dead wrong" that compilers independently choose to define undefined behavior?

Oh, ok; I guess I must just be a stellar programmer to never have received the dreaded "this is undefined" error (or it's equivalent) that would inevitably be emitted in these cases then.

tsimionescu · 2024-08-05T05:10:58 1722834658

I've explained and showed an example of how compilers behave in relation to UB. They don't typically "choose to define it", they choose to assume that it never happens.

They don't throw errors when UB happens, they compile your program under the assumption that any path that would definitely lead to UB can't happen at runtime.

I believe you think that because signed integer addition gets compiled to the `add` instruction that overflows gracefully at runtime, this means that compilers have "defined signed int overflow". I showed you exactly why this is not true. You can't write a C or C++ program that relies on this behavior, it will have optimizer-induced bugs sooner or later.

deaddodo · 2024-08-07T20:34:54 1723062894

No, I said it had consistent behavior to the compiler.

You seem to think I'm saying "undefined behavior means nothing, ignore it"; when what I'm actually saying is "undefined behavior doesn't mean the compiler hasn't defined a behavior and doesn't necessarily = 'bad'". There's dozens of "UB" that C (and C++) developers rely on frequently, because the compilers have defined some behavior they follow; to the point critical portions of the Linux Kernel rely on it (particularly in the form of pointer manipulations).

TL;DR - Is UB unsafe to rely on generally? Yes. Should you ignore UB warnings? Definitely not. Does UB mean that the compiler has no idea what to do or is lacking some consistent behavior? Also, no.

Know your compiler, and only write code that you know what it does; especially if it's in a murky area like "UB".

brabel · 2024-08-04T08:00:21 1722758421

Isn't this a terrible failure of the compiler though? Why is it not just telling you that the `if` is a noop?? Damn, using IntelliJ and getting feedback on really difficult logic when a branch becomes unreachable and can be removed makes this sort of thing look like amateur hour.

PhilipRoman · 2024-08-04T08:22:51 1722759771

    if(DEBUG) {
       log("xyz")
    }

Should the compiler emit a warning for such code? Compilers don't behave like a human brain, maybe a specific diagnostic could be added by pattern matching the AST but it will never catch every case.

robinsonb5 · 2024-08-04T08:37:58 1722760678

There's a world of difference between code that's dead because of a static define, and code that's dead because of an inference the compiler made.

A dead code report would be a useful thing, though, especially if it could give the reason for removal. (Something like the list of removed registers in the Quartus analysis report when building for FPGAs.)

umanwizard · 2024-08-05T03:25:03 1722828303

> There's a world of difference between code that's dead because of a static define, and code that's dead because of an inference the compiler made.

Not really, that’s the problem. After many passes of transforming the code through optimization it is hard for the compiler to tell why a given piece of code is dead. Compiler writers aren’t just malicious as a lot of people seem to think when discussions like this come up.

robinsonb5 · 2024-08-05T14:49:26 1722869366

Yeah, I know the compiler writers aren't being deliberately malicious. But I can understand why people perceive the end result - the compiler itself - as having become "lawful evil" - an adversary rather than an ally.

ncruces · 2024-08-04T09:04:10 1722762250

  #define DEBUG i<num+100

The example is silly, but you should get the point. DEBUG can be anything.

robinsonb5 · 2024-08-05T14:44:54 1722869094

Fair point, however your example is a runtime check, so shouldn't result in dead code.

(And if DEBUG is a static define then it still won't result in dead code since the preprocessor will remove it, and the compiler will never actually see it.)

EDIT: and now I realise I misread the first example all along - I read "#if (DEBUG)" rather than "if (DEBUG)".

anon-3988 · 2024-08-04T12:03:55 1722773035

I am guessing there would be a LOT of false negatives of compilers removing dead code for good reason. For example, if you only use a portion of a library's enum then it seems reasonable to me that the compilers optimizes away all the if-else that uses those enums that will never manifest.

thayne · 2024-08-04T15:21:08 1722784868

I don't think it is unreasonable to have an option for "warn me about places that might be UB" that would tell you if it removes something it thinks is dead because it assumed UB doesn't happen?

tsimionescu · 2024-08-04T08:05:36 1722758736

That's the core of the complaints about how modern C and C++ compilers use UB.

uecker · 2024-08-04T08:29:56 1722760196

The focus was certainly much more on optimization instead of having good warnings (although any commercial products focus on that). I would not blame compiler vendors exclusively, certainly paying customer also prioritized this.

This is shifting though, e.g. GCC now has -fanalyzer. I does not detect this specific coding error though, but for example issues such as dereferencing a pointer before checking for null.

userbinator · 2024-08-04T09:38:20 1722764300

Yes it is. Don't let those who worship the standard gaslight you into thinking any differently.