The NULL check was optimized out. The dereference of foo_ptr was hidden behind TRACE_FOO, which made it even harder to spot. I spent hours on that one :-)
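For illustration, the shape of the bug looked something like this (a hypothetical reconstruction, not the actual code; the real TRACE_FOO was more involved):

#include <stdio.h>

struct foo { int id; };

// The macro hides a dereference of its argument:
#define TRACE_FOO(p) fprintf(stderr, "foo %d\n", (p)->id)

void handle(struct foo *foo_ptr) {
    TRACE_FOO(foo_ptr);       // undefined behavior if foo_ptr is NULL
    if (foo_ptr == NULL)      // ...so the compiler may remove this check
        return;
    // ...
}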
Check p before using it. The issue is when you have code like the following, where the behavior of the first statement is only defined when the condition in the second is false:
int x = *p;
if (p == NULL) {
    // p can't be NULL without having triggered
    // undefined behavior in the first line, so
    // this code is removed by the compiler.
    return;
}
// ...
The fix is to stop dereferencing p before you know whether it's NULL or not:
if (p == NULL) {
    // Moving the check before the dereference
    // avoids undefined behavior and resolves
    // the issue.
    return;
}
int x = *p;
// ...
It is worth noting that unsigned overflow is guaranteed to be defined as 2's complement (wrapping) overflow, at least by Clang and other commonly available C compilers.
This is actually guaranteed by the standard (although the language used is not "2's complement", but "repeatedly adding or subtracting one more than the maximum value that can be stored in the type until the value is in range").
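A minimal sketch of the standard-mandated wrap-around:

#include <limits.h>
#include <stdio.h>

int main(void) {
    unsigned int u = UINT_MAX;
    u = u + 1;            // well-defined: wraps to 0
    printf("%u\n", u);    // prints 0
    // By contrast, signed overflow is undefined:
    // int i = INT_MAX; i = i + 1;   // don't rely on this
    return 0;
}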
C requires that these sorts of type conversions happen through unions: using pointer casts is not correct and undefined behavior results.
Actually, loading a member of a union other than the one most recently stored to is undefined behaviour too - it's just that using a union in this way falls into the "...both Clang and GCC nail down a few behaviors that the C standard leaves undefined." category. (And most other compilers besides - there is much historical usage of this).
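A minimal sketch of the union idiom in question (assuming a target where float and uint32_t are both 4 bytes, as on common Clang/GCC platforms):

#include <stdint.h>
#include <stdio.h>

union pun {
    float f;
    uint32_t i;
};

int main(void) {
    union pun u;
    u.f = 1.0f;
    // Reading a member other than the one last stored: Clang and
    // GCC define this, whereas *(uint32_t *)&u.f would violate
    // strict aliasing.
    printf("0x%08x\n", (unsigned)u.i);   // 0x3f800000 on IEEE-754 targets
    return 0;
}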
It would be nice to have a code snippet for each of the examples. I'm a fairly experienced C++ developer but my knowledge of compilers is, admittedly, lacking and I just want to make sure I'm on the same page.
Otherwise, a great read.
PS: Is it just me or is LLVM coming up more and more these days?
The original piece of C code has undefined behaviour, meaning LLVM can generate anything it wants. It happens to generate ud2 instructions (because it's better to crash hard and fast) but it could just as well print "puppies puppies puppies" a million times.
I'm not so sure about that. If you need to squeeze out a little more performance, code that is "technically undefined" can be more portable than dropping to ASM.
I think LLVM should emit a warning on code with undefined semantics and generate DWIM instructions instead of UD2s.
It really depends on your motives. If you'll need to port to new platforms in the future, it's better to have a hard, fast crash now, so you can find and fix the undefined behavior early, rather than face a large bug backlog years down the line.
But you're correct that sometimes it can be expedient to exploit such technically undefined behavior. (I've committed this sin myself, most commonly in serializers/deserializers)
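For the curious, a hypothetical sketch of the kind of shortcut I mean in a deserializer (names made up), reading a uint32_t out of a byte buffer:

#include <stdint.h>
#include <string.h>

// Technically undefined: the cast may be misaligned and breaks
// strict aliasing, but it "works" on common x86 targets.
uint32_t read_u32_risky(const unsigned char *buf) {
    return *(const uint32_t *)buf;
}

// The well-defined alternative; Clang and GCC typically compile
// this memcpy down to the same single load.
uint32_t read_u32_safe(const unsigned char *buf) {
    uint32_t v;
    memcpy(&v, buf, sizeof v);
    return v;
}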
To make him or her aware of the issue? I'd rather have my program crash at a well-identifiable point in the execution flow than start acting rogue for no obvious reason (in both debug and production environments).
It was roughly this:
int x = *p;               // read from p
if (p != NULL) *p = 42;   // write to p
since the read from p is undefined if p is NULL, gcc could (and did) legally optimize out the NULL check, so you could end up writing to NULL.
[edit] I noticed that this class of bug is actually mentioned in the Regehr article that is linked.