My 'favourite' bit of surprising (not undefined) behaviour I've seen recently in the C11 spec is around infinite loops, where
void foo() { while (1) {} }
will loop forever, but
void foo(int i) { while (i) {} }
is permitted to terminate...even if i is 1:
> An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate
It means that empty loops (loops with empty bodies) can be completely removed if the controlling expression has no side effects.
> This is intended to allow compiler transformations such as removal of empty loops even when termination cannot be proven.
It means while(i) {} can be eliminated as if i were 0, because there are no side effects in the loop expression or the loop body, and what would be the point of the loop if it never terminated on a non-constant expression?
As an optimization, the optimizer is allowed to eliminate it as a useless loop with no side effects. If you really want an infinite loop, you can use while (1) {}.
There are cases where automatically generated C code might have empty loops which are useless.
If you really want to go to sleep, use pause() or similar. An infinite loop eats up CPU cycles.
It's quite common in embedded systems to have the fault handler end with an infinite loop, to give the programmer a chance to attach a debugger and inspect the call stack. Sometimes this behavior is turned on or off with a debug flag, which can trigger this unexpected optimization if the flag is not a compile-time constant.
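A minimal sketch of that embedded pattern (debug_mode and reset_system are illustrative names, not from the article):

extern int debug_mode;            /* runtime flag, not a compile-time constant */
extern void reset_system(void);

void fault_handler(void)
{
    while (debug_mode) {
        /* spin here so a debugger can attach and inspect the call stack */
    }
    /* the controlling expression isn't constant and the loop performs no I/O,
       volatile access, or synchronization, so the clause quoted above lets
       the compiler assume the loop terminates and remove it */
    reset_system();
}

Declaring the flag volatile (or using while (1) behind a compile-time #ifdef) keeps the loop outside the reach of that clause.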
This definition is actually required for the correctness of many standard compiler optimizations such as partial redundancy elimination and code motion.
If the optimizer can determine that "nothing happens" in the loop, it can optimize the loop away without attempting to determine whether or not the loop terminates.
What's the point of this though? Why are you letting programmers write non-functional code? When does the loop exactly terminate? I'm guessing the standard discusses this but at this point idk if I care about memorizing more C trivia.
For many types, 'destroy' will be empty (for example integers), so this just turns into:
for(T* ptr = begin; ptr != end; ++ptr)
{ }
Which is an empty loop and can be optimised away -- assuming we can prove it isn't an infinite loop! Which can be quite hard sometimes (in this case, it requires that 'end' be some multiple of sizeof(T) bigger than begin).
Now in many cases you can prove these loops are finite, but in full generality it's quite hard, and it was considered easier to just let the compiler remove them.
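In C terms the generic pattern looks roughly like this (a sketch; destroy_fn stands in for the destructor call):

#include <stddef.h>

typedef void (*destroy_fn)(void *elem);

/* Generic cleanup over a range. When fn is inlined as a no-op (trivial element
   types), only an empty loop remains, and proving that it terminates requires
   knowing that end is begin plus a whole number of elem_size steps. */
static void destroy_range(char *begin, char *end, size_t elem_size, destroy_fn fn)
{
    for (char *p = begin; p != end; p += elem_size)
        fn(p);
}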
> What's the point of this though? Why are you letting programmers write non-functional code?
He just told you. Because the only way to prevent it in general is to solve the halting problem.
> When does the loop exactly terminate?
In the general case this is provably impossible to determine.
(all you're seeing here is that the compiler authors felt no need to add special case logic to handle "trivial" cases of the halting problem. If the compiler sees any expression in a loop test, it assumes the loop will halt some of the time)
The idea (I assume) is to let compilers optimize away loops without determining that they terminate. I.e. the rule is more aimed at loops which would terminate, but this lets compilers avoid proving that they do in fact terminate.
Grandparent presented this as a "surprising optimization" but I'd argue it's exactly what you'd expect when a compiler sees while(expression) -- he just happened to pick a trivial expression.
I read this as: will confuse the snot out of hapless newbie programmers trying to learn C via stepping through their code in an IDE, while providing no practical benefit to programmers writing production code.
It is necessary in order for the compiler to do transformation optimizations which do impact production code. Newbies shouldn't be writing production code without guidance anyways IMO.
I'll be honest, I didn't find any of these to be particularly surprising. If you've been using C and are familiar with strict-aliasing and common UB issues I wouldn't expect any of these questions to seriously trip you up. Number 2 is probably the one most people are unlikely to guess, but that example has also been beaten to death so much since it started happening that I think lots of people (Or at least, the people likely to read this) have already seen it before.
I'd also add that there are ways to 'get around' some of these issues if necessary - for example, gcc has a flag for disabling strict-aliasing, and a flag for 2's complement signed-integer wrapping.
I don't think #2 has been fully beaten to death yet.
Assuming a platform where you don't segfault (say that 'page 0' variables are valid) and thus execution does proceed, I still can't think of any /valid/ reason to eliminate the if that follows (focus on line 2 in the comments).
Under what set of logic does being able to de-reference a pointer confer that its value is not 0 (which is what the test equates to)?
In my opinion that is an often-working but incorrect optimization.
C programmers expect dead code removal. Especially when the compiler also inlines functions (and, of course, inlining makes the biggest impact on short functions; and one way to get short functions is to have aggressive dead code removal). And macros can expand into very weird, but valid, code; so the statement that "nobody would ever write code like that" isn't relevant. The compiler may well have to handle unnatural looking code.
As others have stated, compilers generally don't actually have special case code to create unintuitive behavior if it looks like the programmer goofed.
It's possible and desirable for a compiler to remove branches of "if" statements that it knows at compile time won't ever be true. And, of course, one special case of statically known "if" statements are checks for NULL or not-NULL pointers in cases where the compiler knows that a pointer will never be NULL (e.g., it points to the stack) or will always be NULL (e.g., it was initialized to NULL and passed to a function or macro).
So the standard allows the compiler to say "this pointer cannot be NULL at this point because it was already dereferenced." Either the compiler is right because the pointer couldn't be NULL, or dereferencing the pointer already triggered undefined behavior, in which case unexpected behavior is perfectly acceptable. Some programmers will complain because the compiler won't act sensibly in this case, but C doesn't have any sensible option for what the compiler should do when you dereference a NULL pointer (yes, your operating system may give you a SEGFAULT, but the rules are written by a committee that can't guarantee that there will be an operating system).
I have no idea if it's true, but compiler implementers swear that their compilers perform so many optimizations, and that those optimizations allow additional optimizations, that this kind of approach would bury you in messages.
C programmers should be able to expect that "optimizations" will not transform program meaning. And because C is so low level, certain types of optimizations may be more difficult or impossible. If the pointer was explicitly set to NULL, the compiler can justifiably deduce the branch will not be taken but the deduction "if the programmer dereferenced the pointer it must not be NULL" is not based on a sound rule. In fact, the whole concept that the compiler can make any transformation it wants in the presence of UB is wacky. Optimization should always be secondary to correctness.
Ok, so you want compilers to generate a translation for program #2 that works "correctly" to your mind when x is null.
Please explain to the class the meaning of the construct int y = *x; when x is null, so that all conforming C compilers can be updated to generate code for this case "correctly".
Something like `assert(x != NULL); int y = *x;` is probably closer to the intended meaning. Checking if(!x) afterwards is still dead code, but at least the program is guaranteed to fail in a defined manner.
Of course, if implemented this way, every dereference would have to be paired with an assert call, bringing the performance down to the level of Java. (While bringing the memory safety up to the level of Java.)
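For context, the function being compiled in the following comments is presumably along these lines (a reconstruction; the article's exact listing isn't quoted in this thread):

void bar(void);

void foo(int *x)
{
    int y = *x;   /* dereference before the null check */
    if (!x)
        return;
    bar();
}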
foo(int*): # @foo(int*)
test rdi, rdi
je .LBB0_1
jmp bar() # TAILCALL
.LBB0_1:
ret
(You might like to compare this with the article's claims.)
More the sort of thing that I'd expect to be generated (since it's a more accurate rendering of the C code):
foo(int*): # @foo(int*)
mov eax, [rdi]
test rdi, rdi
je .LBB0_1
jmp bar() # TAILCALL
.LBB0_1:
ret
I know that NULL is a valid address on x64 (it's zero), and on any system that doesn't completely suck it will be unmapped. (If I'm using one of the ones that sucks, I'll already be prepared for this. But luckily I'm not.) So I'd like to feel confident that the compiler will just leave the dereferences in - that way, when I make some horrid mistake, I'll at least get an error at runtime.
But rather than compile my mistakes as written, it seems that the compiler would prefer to double down on them.
If you want the compiler to leave the dereferences in, use `-fno-delete-null-pointer-checks` or ask your compiler vendor for an equivalent option. Compilers delete null pointer checks by default (on typical platforms) because it's what most users want.
That's because y is never used, so why should the pointer be dereferenced? If you call bar(y) instead of just bar(), "x86-64 clang 3.9.1" with -O3 does the load as well (but after the check):
foo(int*): # @foo(int*)
test rdi, rdi
je .LBB0_1
mov edi, dword ptr [rdi]
jmp bar(int) # TAILCALL
.LBB0_1:
ret
Only GCC does the kind of aggressive optimization the article mentions (and might need to be tamed by -fno-delete-null-pointer-checks).
Ha... yes, a good point. That's a reasonable reason not to generate the load, and stupid me for not noticing. What's also dumb is that my eyes just glossed over the ud2 instruction that both compilers put in main too. The program (not unreasonably) won't even run properly anyway.
gcc does seem to be keener than clang to chop bits out - I think I prefer clang here. But let's see how I feel if I encounter this in anger in a non-toy example ;) I must say I'm still a bit suspicious, but I can't really argue that this behaviour is especially surprising here, or difficult to explain.
That's the opposite of what he wants. He wants a compiler that produces a translation which dereferences the null and follows the true branch.
He wants null to be a valid, addressable address (which is completely nonstandard and not portable everywhere C is used) to be the standard behavior that portable compilers emit code for.
He wants to imagine that null and zero are the same thing.
> In fact, the whole concept that the compiler can make any transformation it wants in the presence of UB is wacky.
That's the way it's often explained but it's not really what happens--the compiler doesn't scan for undefined behavior and then replace it with random operations. Instead, it's applying a series of transformations that preserve the program's semantics if the program stays "in bounds", avoiding invoking undefined behaviors.
I agree that equating "if the programmer dereferenced the pointer" and "the pointer must not be NULL" betrays a....touching naivety about the quality of a lot of code, but if you start from the premise that the programmer shouldn't be doing that, the results aren't totally insane.
> C programmers should be able to expect that "optimizations" will not transform program meaning.
That's the official rule, but it's "program meaning as defined by the standard." It's not perfect, but nobody's come up with a better alternative. We get bugs because programmers expect some meaning that's not in the standard. But compilers are written according to the standard, not according to some folklore about what reasonable or experienced programmers expect.
Again, the idea isn't that the compiler found a mistake and will do its best to make you regret it. Dereferencing a pointer is a clear statement that the programmer believes the pointer isn't NULL. The standard allows the compiler to believe that statement. Partly because the language doesn't define what to do if the statement is false.
> But compilers are written according to the standard.
Written to the writer's _interpretation_ of the standard. I bet money that every compiler written from a text standard has failed to follow said standard somewhere. It would be nice if a standard included code fragments used to show/test the validity of what is stated.
Actually that's not correct. The standard says the behavior is up to the compiler. The compiler author took that as a license to produce a non truth preserving transformation of the code. The actual current clang behavior also satisfies the standard as written.
> The standard says the behavior is up to the compiler.
I think this statement is correct, but it's the kind of thing people say when they confuse implementation defined behavior and undefined behavior. And that distinction is key.
Implementation defined behavior means the compiler gets to choose what it will do, document the choice, and then stick to it.
Undefined behavior means that the program is invalid, but the compiler isn't expected to notice the error. Whatever the compiler spits out is acceptable by definition. The compiler can generate a program that doesn't follow the rules of C; or that only does something weird when undefined behavior is triggered, but the weird behavior doesn't take place on the same line as the undefined behavior; etc.
It's certainly true that "the compiler isn't expected to notice the error" doesn't prohibit a compiler from noticing the error. A compiler can notice, but it's standard conforming even if it doesn't.
I should probably mention that when I say "the standard" I mean the C language standard; the POSIX standard may add requirements such that something undefined according to the language is well defined on POSIX.
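A small illustration of the difference (the function names are just labels):

/* Implementation-defined: the compiler must pick a behavior, document it,
   and apply it consistently. Right-shifting a negative value is one example. */
int shift_one(int x)
{
    return x >> 1;     /* implementation-defined result when x < 0 */
}

/* Undefined: the standard places no requirements at all, and the compiler
   may assume it never happens. Signed overflow is one example. */
int add_one(int x)
{
    return x + 1;      /* undefined behavior when x == INT_MAX */
}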
So the standard does not require the compiler to e.g. remove the "redundant" check for null or assume that signed integers don't cycle values on overflow, but _permits_ the compiler to do so. Thus, we have two problems: a poorly thought out standard which permits dangerous compiler behavior and poorly thought out compiler that jumps at the chance.
I never said the standard required the compiler to remove the redundant check; just that it is allowed to, and I gave an example of why it might.
But since the compiler is allowed to, programmers have to act accordingly.
Yes, this does lead to a situation similar to what C.A.R. Hoare described:
"Now let me tell you about yet another overambitious language project. ... I was a member and even chairman of the Technical Committee No. 10 of the European Computer Manufacturers Association. We were charged ... with ... the standardization of a language to end all languages. ... I had studied with interest and amazement, even a touch of amusement, the four initial documents describing a language called NPL. ... Each was more ambitious and absurd than the last in its wishful speculations. Then the language began to be implemented and a new series of documents began to appear at six-monthly intervals, each describing the final frozen version of the language, under its final frozen name PL/1.
"But to me, each revision of the document simply showed how far the initial Flevel implementation had progressed. Those parts of the language that were not yet implemented were still described in free-flowing flowery prose giving promise of unalloyed delight. In the parts that had been implemented, the flowers had withered; they were choked by an undergrowth of explanatory footnotes, placing arbitrary and unpleasant restrictions on the use of each feature and loading upon a programmer the responsibility for controlling the complex and unexpected side-effects and interaction effects with all the other features of the language" ( http://zoo.cs.yale.edu/classes/cs422/2014/bib/hoare81emperor... , pg. 10).
In The Design and Evolution of C++, Stroustrup mentioned that occasionally when designing a feature, he could think of multiple possibilities: often one would involve convoluted rules that the compiler could enforce, and the other would be a simple rule (or set of rules) that the compiler couldn't necessarily enforce. He said he generally chose the simple rules, even if he didn't know how to have the compiler detect violations. So C++ ended up with things like the One Definition Rule (and violation is undefined behavior). I've never seen any similar statements from Ritchie or Thompson, but I suspect they followed a similar approach. Of course, today both languages are governed by committees, so how they balance trade-offs may have changed.
> Under what set of logic does being able to de-reference a pointer confer that its value is not 0 (which is what the test equates to)?
You're conflating null and zero (which C encourages you to do for various terrible reasons). The test does not test that x is not zero; it tests that x is not null (null, like zero, is falsey, but again, null is not to be mistaken for zero), which in C source is usually written as the constant 0 but which legally can be totally distinct from the bit-pattern zero and which should be thought of as totally distinct. Zero can be a valid address in memory. Null is never a valid address in memory. An integer constant expression with value zero, converted to a pointer type, is guaranteed to be the null pointer (which may have a different bit pattern than zero!). Converting a run-time integer that merely happens to hold zero is not guaranteed to produce a null pointer. Confused yet?
The compiler isn't. It knows that you're testing that a pointer is not null.
Since x has already been dereferenced, and since dereferencing x has no translatable meaning if x is null, it follows that we can only produce a meaningful translation of this program if x is not null.
It therefore follows that x must not be null in the test, since x has not changed.
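To make the null-versus-zero distinction above concrete (a sketch):

#include <stdio.h>
#include <string.h>

int main(void)
{
    int *p = 0;                /* the integer constant 0 converts to a null
                                  pointer, whatever bit pattern null uses */
    int *q;
    memset(&q, 0, sizeof q);   /* all-bits-zero object representation: not
                                  guaranteed by the standard to be null */
    printf("%d %d\n", p == NULL, q == NULL);   /* first is always 1; the
                                                  second only on common platforms */
    return 0;
}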
I'm sorry, but that is not what the current standard, no matter its weaknesses, requires. The standard says that dereferencing a null pointer results in undefined behavior. The compiler writer chose to (a) deduce that the pointer dereference was not an oversight and (b) then produce a pointless "optimization" that compounded the error. Compiling the dereference to a move instruction is perfectly compatible with the standard.
> Under what set of logic does being able to de-reference a pointer confer that its value is not 0 (which is what the test equates to)?
Simple: undefined behavior makes all physically possible behaviors permissible.
In reality though, such an elimination only matters if the function is ever actually called with NULL, and if the compiler is smart enough to prove that, hopefully the compiler writers are not A-holes and will have it warn instead of playing silly-buggers.
This is a kind of Department of Motor Vehicles Bureaucrat thinking. For example, there are many thousands of lines of C code that reference *0, which is a perfectly good address in some environments. One should be able to depend on compilers following the expressed intentions of the programmer and not making silly deductions based on counter-factual assumptions.
> This is a kind of Department of Motor Vehicles Bureaucrat thinking.
Sorry, but modern compilers are basically automatic theorem provers. They'll use whatever means necessary to get every last drop of performance. If you play cowboy with them you'll just get hurt.
> For example, there are many thousands of lines of C code that reference *0, which is a perfectly good address in some environments.
It's permissible for a particular platform to define behaviors that the standard has left undefined. If you try to take that code and run it elsewhere, that's your problem.
If you want to phrase it like that, a compiler tries to prove conjectures and performs actions (usually code and data elimination) based on whether it can prove them or their negatives. Sometimes it can't do either.
It's easy to see the problem in the null pointer case. The compiler deduction is that the null test is redundant, but it's not actually redundant. Therefore the compiler "proves" a false theorem. That the standard's rules permit the compiler to deduce false things would be, in normal engineering analysis, considered to show a failure in the rules, but the bureaucratic mindset holds that the rules are always, by definition, correct, so the failure is the fault of the miscreant who neglected to staple the cover sheet properly.
If the compiler is unable to prove a transformation preserves correctness, it should not do the transformation.
To your point below: The compiler is definitely not "forced" to assume that the pointer is not null - that is a choice made by the compiler writers. Even the ridiculous standard does not require the compiler writer to make that assumption. The compiler can simply compile the code as written - or, if it is smart enough to see the problem - it can produce a warning.
> Therefore the compiler "proves" a false theorem.
In the axiomatic system implied by the standard, the hypothetical compiler being discussed can prove that the null check can be eliminated. The fact that you believe this axiomatic system is inconvenient does not constitute a refutation of the truth of the theorem.
> If the compiler is unable to prove a transformation preserves correctness, it should not do the transformation.
Actually, the compiler is able to prove the invariance of correctness. Eliminating a null check after a pointer has been dereferenced does in fact preserve the correctness. Either the function is never called with NULL, and the program is correct, or the function is sometimes called with NULL, and the program is incorrect.
> the bureaucratic mindset holds that the rules are always, by definition, correct
Since you think that the compiler following the standard rigorously is "bureaucratic" and, I imagine, bad, it follows that you would prefer the compiler to sometimes ignore what the standard says and do something different. I suggest that you try compiling your code with a compiler for a different language. Compilers for languages other than C are guaranteed to not follow the C standard.
EDIT: I think I see where you're going. Your argument is that, since in some platforms NULL can be correctly dereferenced, if the compiler was to eliminate the null check, that would change the behavior of the program. If a compiler for that platform did that, I would agree that it would not be preserving correctness. A compiler for a given platform is only required to generate correct code for that particular platform. Compilers for platforms where dereferencing a null pointer is always invalid can correctly eliminate that late null check.
The compiler didn't create a security bug by removing the null check. The bug was created by the programmer when he didn't check for null before dereferencing the pointer. Even with the check, the program contained a bug.
The compiler converted a buggy program that was prevented from opening a security hole by defense in depth into a program with a security hole. It transformed a careless error into a systemic error, all in the cause of a micro-optimization that didn't.
In the referenced case the introduced error involved a reference to a null pointer but there was still no exploitable security hole. The exploit was enabled when the compiler removed an explicit check. The null dereference was an error, but it was not a security issue on its own.
> The compiler deduction is that the null test is redundant, but it's not actually redundant.
No. The compiler is forced to assume x isn't null, because int y = *x; has no meaning if x IS null, so the compiler can't possibly generate any code to cover that case. There's no definition of that construct for the compiler to work off of that could possibly allow x to be null.
Blame the standard if you want, but you can't blame the compiler for not generating code to cover behaviour that you've made up in your head.
I didn't make it up in my head - I observed it in working code. Widely used working code. And the compiler is not at all forced to assume x is not null. The standard leaves it to the compiler writer to handle the case. Could the compiler perform sophisticated static analysis and reject the code under the standard? Yes. Could the compiler simply compile the code as written? Yes. Could the compiler abuse the vagueness of the standard to produce an "optimization" that implements something the programmer specifically did not implement? I suppose. But that's poor engineering.
> the bureaucratic mindset holds that the rules are always, by definition, correct
It has long been known that you can get better optimization if you disregard correctness ;)
C compiler writers know that the assumption "pointer is not null because it was dereferenced" doesn't hold in general. C compiler writers know that they're performing incorrect optimizations.
The bureaucracy now doesn't tell them that the transformation is correct. It tells them it is fine to be incorrect because UB.
The bureaucracy gives them a "license to kill" for the greater good.
(What is the greater good, you ask? Can't answer that; ask a bureaucrat.)
If you don't like the "ridiculous" standard, maybe you shouldn't be writing in the language that it defines. There are plenty of discussions online about what parts of the standard should be changed to get a "friendly C" [1], unfortunately there is no consensus that could be implemented.
In order to do that, the standard would have to define that dereferencing a null pointer must produce a deterministic behavior. There are only two possible behaviors:
1. The program successfully reads/writes that memory location and retrieves/overwrites whatever is there without crashing. Then the program can continue on and execute the if even if the pointer was null.
2. The program crashes immediately whenever a null pointer is read/written.
#1 is problematic, because NULL is a single pointer value that can be applied to a variety of pointer types. What happens if you first write to (long *)NULL and then read from (FILE *)NULL?
#2 is very useful, and most platforms already crash any program that tries to read or write NULL. But if the standard requires this behavior, then this introduces an even stronger guarantee that a dereferenced pointer is not null, so there's no reason to remove that optimization.
C is not Haskell or Java. The C programmer may intend to interact with actual hardware and is not required to interact with some abstract machine. The standard can reflect this or it can attempt to convert C into a poorly designed high level language. Dereferencing the null pointer should be implementation dependent, but the compiler should be required to either detect and flag this as an error or compile it into the machine operations indicated. The actual execution in the second case may depend on the environment.
Sorry, but you are just wrong. The C standard does define an abstract machine.
> but the compiler should be required to either detect and flag [dereferencing the null pointer] as an error
How could the compiler detect at compile time the value of a run time variable? Sure, some instances might be detectable, but those are the extreme minority. Static analysis tools such as the Clang analyzer are already capable of finding those compile-time NULLs.
> or compile it into the machine operations indicated. The actual execution in the second case may depend on the environment.
Which is exactly what's done now. On most platforms accessing NULL causes a crash, so either the pointer is not null and the program doesn't crash, so the check is redundant; or the pointer is null and the program does crash, so the check is never executed.
This battle has been fought and lost. If you require sensible behaviour, just move on and use a language that offers it. C compilers will do what makes them look good on benchmarks, and various "friendly C" efforts have been tried and failed.
You can if, for example, the function is static. Also there could be a link-time optimization pass. The linker can see all calls to that function, unless the function is exported.
You're thinking about it in the context of actual computers. The C standard says absolutely nothing about what NULL has to be, besides that the integer value 0 is considered to be the NULL address and that dereferencing it is considered invalid. The NULL address does not have to be all 0 bits. Architectures are generally free to define it to any invalid address they want to be NULL, 0 just happens to be a common and easy one. The catch you're pointing out is that on x86 there are technically no 'invalid' addresses, so we just use 0 and assume you won't ever attempt to use the stuff there (Which in practice on x86, nobody puts anything there).
>The catch you're pointing out is that on x86 there are technically no 'invalid' addresses
Depending on what you mean precisely by "x86," there is such a thing as an invalid address: the IA32e architecture (or whatever you want to call Intel's flavour of 64-bit "x86") requires that the <n> high bits of an address match, where <n> is machine-dependent.
That's a fair point - though I think you could debate whether or not the current 48-bit address space of x86-64 is part of the architecture or just an implementation detail. But in the end I don't really think it matters which you consider it (And I'd be happy to consider it to be either). All that said, you're completely right that with current x86-64, there are 64-bit values that could never be a valid address.
> Under what set of logic does being able to de-reference a pointer confer that its value is not 0 (which is what the test equates to)?
Normal deductive logic?
* No NULL pointer can be dereferenced.
* x is dereferenced.
* Therefore, x is not a NULL pointer.
Of course, the compiler is presuming that your code is correct. That's a reasonable presumption when dealing with computer programming languages. Programming languages would be rather hard to interpret and translate--not to mention impossible to optimize--if you couldn't apply basic deductive logic to their statements.
wouldn't you expect the compiler to apply the same optimizations? Or would you be upset that eliding the check broke some code that depended on a race condition somewhere else in your program?
Also, pointing out that the "value is not 0 (which is what the test equates to)" is a non-sequitur. During compilation the literal 0 can behave as a NULL pointer constant. But the machine representation of a NULL pointer does not need to be all-bits 0, and such machines still exist today. And usually, as in this case, the distinction is irrelevant. It doesn't matter that the 0th page is mappable on your hardware. What matters is that the C specification says that a NULL pointer cannot be dereferenced; that dereferencing a NULL pointer is nonsense code.
There's an argument that compilers should be careful about the optimizations they make. Not all programs are correct, and taking that presumption too far can be detrimental. But it's not always trivial to implement an optimizing compiler to "do what I say, not what I mean". Optimizations depend on the soundness of being able to apply deductive logic to a program--that is, being able to string together a series of simple predicates to reach a conclusion about program behavior. You often have to add _more_ complexity to a compiler to _not_ optimize certain syntactic constructs. Recognizing the larger construct, especially only the subset that are pathological, without optimizing the ones everybody expects to actually be optimized, can be more difficult than simply applying a series of very basic deductive rules. So it's no wonder that most compiler implementations, especially high-performance compilers, tend to push back on this front.
What would be nice is for compilers to attempt to generate diagnostics when they elide code like that. An optimizer needs to be 100% correct all the time, every time. A diagnostic can be wrong some amount of time, which means it's easier to implement and the implementation of a particular check doesn't ripple through the entire code base.
GCC and clang implement many good diagnostics. But with -Wall -Wextra they also tend to generate a lot of noise. Nothing is more annoying than GCC or clang complaining about perfectly compliant code for which there's no chance of it hiding a bug. For example, I often used to write initializer macros like:
#define OPTIONS_INIT(...) { .foo = 1, .bar = 3, __VA_ARGS__ }
struct options {
    int foo;
    int bar;
};
allowing applications to have code like:
struct options opts = OPTIONS_INIT(.bar = 0);
But with -Wall GCC and clang will complain about the second .bar definition overriding the first. (Because the macro expands to { .foo = 1, .bar = 3, .bar = 0 }). The C specification guarantees in no uncertain terms that the last definition of .bar wins. And presumably it guarantees that precisely to make writing such macros feasible. I've never once had a problem with unintentionally redefining a struct field in an initializer list. Yet GCC and clang are adamant about complaining. It's so annoying especially because 1) there's absolutely nothing wrong with the code and 2) disabling the warning requires a different flag for clang than for GCC.
(I realize that for such option types you usually want to define the semantics so that the default, most common value is 0. But it's not always desirable, and certainly not always practical, to stick to that model. And that's just one example of that construct.)
In standard C NULL pointers cannot be dereferenced. Full stop, there is nothing to argue about it.
There are environments that either lack memory protection or do not allow invalid pointer dereferences to be caught, which means that you can't rely on the MMU to catch mistakes. In this case either deal with silent errors or get a compiler that is able (at a cost) to sanitize every pointer access.
There are other systems on which memory address 0 is a perfectly valid address. The ABI of these systems should pick a different bit pattern for NULL pointers, but often doesn't, so compilers sometimes offer, as a conforming extension, the option to allow null pointers to be treated as valid (in effect not having null pointers at all).
Embedded systems (which are frequently programmed in assembly or 'C' code).
Such systems very often map a small bit of high-speed (on chip) RAM to the first few bytes of address space. I very distinctly recall such an embedded system in a college course.
Yeah, I agree. I used to write C full time, but haven't in around 6 years, and I only flubbed #11 & #12 (I knew there was undefined behavior but couldn't remember why; after reading the answers I was like "duh", esp for #12 after having read #11).
I've never actually run into #2 in practice, though: even at -O3 the dereference in line 1 has always crashed for me, though I guess probably because I've never written code for an OS where an address of 0 is valid and doesn't cause a SIGSEGV or similar.
What's the best way to "fix" strict aliasing without disabling the undefined behavior around it? Using a union?
I think the catch with #2 is that we probably run into it more in the opposite way - an unnecessary NULL check put in from, say, inlining a function or expanding a macro is removed. On that note though, even in OS code address 0 is usually set up to cause some type of fault to catch errors. I think the issue happens when the compiler removes or moves the dereference somewhere else - though obviously the cases where this is legal are limited. But in the posted code, for example, the `y` variable is unused and thus could be removed entirely, which would also remove the NULL dereference (while still removing the NULL check).
> What's the best way to "fix" strict aliasing without disabling the undefined behavior around it? Using a union?
I was just talking about `-fno-strict-aliasing`, which is a flag for `gcc` (And `clang` I assume), but it does remove all UB like you're saying by simply allowing all pointers to alias.
The other options are unions like you're thinking (Though that's also technically UB, since writing to one union member and reading from another is undefined, though most compilers allow it without incident), or extensions like `gcc`'s `may_alias` attribute. The `may_alias` attribute is really the cleanest way to do it, but for something like OS code the aliasing happens in such strange places that just disabling strict-aliasing completely tends to be the way to go.
> What's the best way to "fix" strict aliasing without disabling the undefined behavior around it? Using a union?
I had this discussion with another C++ programmer and we came to the conclusion that, if you care to avoid that particular UB, any time you cast pointers between unrelated or basic types and you're going to write to one pointer and read from the other, you need to go through a union, as annoying as it is.
Not just a union, but the union definition needs to be in scope _and_ used such that the compiler can see the possibility of the relationship between the two objects.
But a union doesn't magically make type-punning correct. This code is not correct:
#include <stdio.h>

int main(void) {
    union {
        int d;
        long long lld;
    } u;
    u.d = 1;
    printf("%lld\n", u.lld);   /* type punning: reads a member other than the one last written */
    u.lld = 0;
    printf("%lld\n", u.lld);   /* fine: reads the member that was last written */
    return 0;
}
The union ensures that the compiler doesn't move "u.lld = 0" above the first print statement, but usually writing from one type and reading from another is undefined behavior no matter how you accomplish it. That's because the representations can be different, and one or the other might have invalid representations. The biggest exception is reading through a char pointer; reading representation bits through a char pointer is guaranteed to always be okay.
Aliasing and type punning are two different issues that are only tangentially related in terms of language semantics. But the issues do often coincide, especially in poorly written code.
You can also put the compiler on notice not to apply the strict aliasing rule by using simple type coercion (implicit or explicit) in the relevant statements. What matters is that we put the compiler on notice that two objects of [seemingly] different types are related and thus have an ordering relationship, and the standard provides a few ways to do that.
For example, this code is wrong:
struct foo {
    int i;
};
struct bar {
    int i;
};

void baz(struct foo *foo, struct bar *bar) {
    foo->i = 0;
    bar->i++;      /* under strict aliasing, this may not see the write above */
}

int main(void) {
    struct foo foo;
    baz(&foo, (struct bar *)&foo);
    return 0;
}
and it's also a weird case where the superfluous-looking cast is necessary.
The purpose in all 4 cases is to make it evident vis-a-vis C's typing system that two objects might alias each other, and they do that by using constructs that put those objects into the same universe of alias-able types.
The conspicuous description of the union method in the C standard is more directed, I think, at compiler writers. It's not the only way to alias correctly (explicit casting to the basic type is enough), but often times it's the most natural when dealing with polymorphic compound objects.
Compiler writers historically didn't always implement enough smarts in their compilers to be able to detect possible aliasing through unions, and that needed to be addressed by a more thorough specification of union behavior. That is, the standard needed to make it clear that a compiler was required to grok the relationship of two sub-objects (of the same basic type) that were derived from the same root union type.
Explicitly type-casting through a union just for aliasing is a little stilted, though, when you can achieve the same thing using a cast through a basic type. The union method is preferable, but only in so far as it's used to _avoid_ or to _minimize_ type coercion. And it'll never solve type punning issues.
> The union ensures that the compiler doesn't move "u.lld = 0" above the first print statement, but usually writing from one type and reading from another is undefined behavior no matter how you accomplish it.
I know, but the only reason aliasing becomes an issue is because someone is trying to cast between unrelated pointer types to perform cheap type conversions. Yes, even with the union the behavior is undefined, but if you know the platform you're targeting the program may be well-behaved.
As for your snippets, yes, casting pointers across function boundaries will work. The problem is when you don't want to introduce a call, which is where unions come in.
Yes, like most of the "undefined behaviour allows your computer to format the disk"-style posts this one seems to be written by a programmer with novice-intermediate C knowledge.
What irks me is the intro >> The purpose of this article is to make everyone (especially C programmers) say: “I do not know C”. <<
I think the purpose of the article was mainly for the author to write down some things he learned. Apparently it was his expectation that readers wouldn't be able to answer the quiz.
However, if you can't answer at least most of these questions correctly you're _not_ an expert C programmer.
So I think the correct intro here should be "The purpose of this blog post is to show that if you want to learn C, you actually have to learn it and should not attempt to 'wing it'".
...and maybe also that you should not write patronizing blog posts about a topic which you haven't fully grasped yet yourself.
Not a full-time C programmer, and I was still correct on all of them except #1. Certainly C is more dangerous than other languages, but I don't understand the push to convince people that it is impossible to understand.
I don't think most C programmers share your depth of knowledge of the language. I tried hard to explain strict aliasing once, and utterly failed. The dev was convinced that he knew the exact behavior of the platform, and that it was fine. Yet people constantly find examples where we "know" what the compiler will do, and it does something completely different.
I would agree that strict-aliasing is a pain point for a lot of C devs, which is unfortunate. I'd only suggest that in general, if the strict-aliasing rule is coming into play you're probably already doing something really shady to begin with. Like in this example, casting a `long *` to an `int *` is likely a bad way to go about things even without worrying about strict-aliasing. In a lot of ways, I'd say that problems with the strict-aliasing rule are a symptom of a larger problem. If you can convince them that what they're doing is just bad coding practice to begin with, you might have a better time making them write correct code in the long run.
Now if you're working more directly with hardware (Which is of course possible/likely with C) then it might just be easier to disable strict-aliasing altogether if you can, since identifying all the spots where it might be a problem tends to be an issue.
In the really simple cases (like accessing a float as a long, or similar), you're right.
The problem is the interpretation that's been applied to aggregate types:
struct a {
    int variant;
};

struct b {
    int variant;
    long data;
};
I have a pointer to a struct b - can I cast it to a pointer to struct a and still dereference 'variant'? It has the correct type and is guaranteed to live at the start of the struct in both cases. The prevailing opinion seems to be "no" (see https://trust-in-soft.com/wp-content/uploads/2017/01/vmcai.p... e.g. example st1).
The BSD socket API was built on exactly this kind of type punning.
> I have a pointer to a struct b - can I cast it to a pointer to struct a and still dereference 'variant'?
I think it is a bit of a gray area, but personally I've always held the opinion/understanding that no, that is invalid. The C standard does make one point fairly clear on strict-aliasing: it revolves around the idea that an object can only be considered to be one type of data (Or a `char` array, the only exception). Your example would be invalid for the reason that you can't treat an object of `struct b` as though it is a `struct a` - the fact that they share the same preamble doesn't change that. To be clear with what I'm saying: `struct b` can alias an `int`. `struct a` can also alias an `int`. But `struct b` can't alias a `struct a`, and because of this an int accessed through a `struct a` can't be accessed through a `struct b`.
That said, in general I find this to usually be a fixable problem, which also has (IMO) a cleaner implementation:
struct head {
    int variant;
};

struct a {
    struct head h;
};

struct b {
    struct head h;
    long data;
};
Now you can take a pointer to a `struct b` object and treat it like a pointer to a `struct head` object (Because it is a `struct head` object). You could do the same thing with objects of type `struct a`. So now you can cast both of them to `struct head` and examine the `variant` of both. Then later you could cast it back to the proper type.
This approach to aggregate types is heavily used in the Linux Kernel and other places (Including most of my own code). The `container_of` macro makes it even nicer to use (Though the legality of the `container_of` macro is an interesting debate...).
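A sketch of that pattern with a simplified container_of (the kernel's real macro adds type checking):

#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct head { int variant; };
struct b    { struct head h; long data; };

long get_data(struct head *h)
{
    /* h is known to be embedded in a struct b; recover the outer object */
    struct b *outer = container_of(h, struct b, h);
    return outer->data;
}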
> The BSD socket API was built on exactly this kind of type punning.
Kinda. It's actually surprising how close it comes to skirting this issue (And it does skirt it), but I believe it's actually possible to use BSD sockets without ever violating the strict-aliasing rule (Though of course, there are ways of using it which would arguably violate the rule). In most cases for BSD sockets, strict-aliasing is more likely going to be broken in the kernel, not your code.
To note though, the strict-aliasing rule only applies to dereferencing pointers. You can cast pointers back and forth all day, you just have to ensure that when you're done you're treating the object as the type you originally declared it as. Thus, if you pass a `struct sockaddr_in` to `bind` and cast it to a `struct sockaddr`, the strict-aliasing rule isn't violated because you never dereferenced the casted pointer.
Going along with that, as long as you correctly declare your `struct sockaddr`s from the beginning you won't have any strict-aliasing woes. The only situation where this could technically be a problem is `accept` and `recvfrom`, since they are the only functions that give a `struct sockaddr` back. But assuming you already know what address-family the socket is using, you can declare the correct `struct sockaddr` for that family from the start, cast it and pass it to `accept` or `recvfrom`, and then use it as your originally declared type without breaking strict-aliasing.
Of course, it's also worth keeping in mind that the BSD sockets interface came before C89. You definitely wouldn't design it the same way if you were to do it today.
> Kinda. It's actually surprising how close it comes to skirting this issue (And it does skirt it), but I believe it's actually possible to use BSD sockets without ever violating the strict-aliasing rule (Though of course, there are ways of using it which would arguably violate the rule). In most cases for BSD sockets, strict-aliasing is more likely going to be broken in the kernel, not your code.
Well, firstly it's pretty unsatisfying to hear that yes, this API contravenes strict aliasing restrictions, but only on the library side! - essentially that it is impossible to implement the sockets C API in C.
That aside, this still excludes long-standing examples like embedding a pointer to struct sockaddr in your client struct, which points to either a sockaddr_in, sockaddr_in6 or sockaddr_un depending on where that client connected from (well, you can still do it, but now you can't examine the sockaddr's sa_family member to see what type the address really is - you need to have a redundant, duplicate copy of that field in the client struct itself).
The situation is similar with sockaddr_storage. The whole point of that type is to allow you to stash either AF_INET or AF_INET6 addresses in the same object and then examine the ss_family field to see what it really is - the text in POSIX says:
The <sys/socket.h> header shall define the sockaddr_storage structure, which shall be:
* Large enough to accommodate all supported protocol-specific address structures
* Aligned at an appropriate boundary so that pointers to it can be cast as pointers to protocol-specific address structures and used to access the fields of those structures without alignment problems
> Of course, it's also worth keeping in mind that the BSD sockets interface came before C89. You definitely wouldn't design it the same way if you were to do it today.
Well, the aforementioned sockaddr_storage came about after C89.
And wasn't C89 supposed to be about codifying existing practice, anyway?
> Well, firstly it's pretty unsatisfying to hear that yes, this API contravenes strict aliasing restrictions, but only on the library side! - essentially that it is impossible to implement the sockets C API in C.
Then it'd also be pretty unsatisfying to hear that it's impossible to write an OS kernel in standard C too - it requires compiler extensions for lots of various details. It's impossible to write a standard C library in nothing but standard C as well. I would absolutely agree, however, that the fact that you have to use an extension to get past strict-aliasing is unfortunate (It'd be nice if in a future C they add something like the `may_alias` extension into the standard). But you can do it, and OS code is definitely one where extensions are going to be rampant anyway. For example, the Linux Kernel disables strict-aliasing completely.
> That aside, this still excludes long-standing examples like embedding a pointer to struct sockaddr in your client struct, which points to either a sockaddr_in, sockaddr_in6 or sockaddr_un depending on where that client connected from (well, you can still do it, but now you can't examine the sockaddr's sa_family member to see what type the address really is - you need to have a redundant, duplicate copy of that field in the client struct itself).
Strictly speaking, that's not true. You can examine it directly, but it requires casting the `struct sockaddr *` to a `sa_family_t *` instead. This is valid because regardless of what type of object it really is, it must start with a `sa_family_t` entry, so you are treating it as the correct type (And by that notion, `sa_family_t` is allowed to alias any of the `struct sockaddr` types). Besides that, you could also use a `union` to combine all the possible `sockaddr`s that you're going to handle together with a `sockaddr_storage`. Then you can do things like normal and simply access the `sockaddr` through the correct union member.
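A sketch of that union approach (the union and its member names are mine, not part of the sockets API):

#include <sys/socket.h>
#include <netinet/in.h>
#include <sys/un.h>

union any_sockaddr {
    struct sockaddr         sa;
    struct sockaddr_in      in4;
    struct sockaddr_in6     in6;
    struct sockaddr_un      un;
    struct sockaddr_storage storage;   /* guarantees size and alignment */
};

/* Usage sketch:
 *     union any_sockaddr addr;
 *     socklen_t len = sizeof addr;
 *     int fd = accept(listen_fd, &addr.sa, &len);
 *     if (fd >= 0 && addr.sa.sa_family == AF_INET)
 *         ... addr.in4.sin_port ...
 */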
> Well, the aforementioned sockaddr_storage came about after C89.
That it did. However, that was more about making the current API work with larger addresses (namely ipv6) not to fix the API. There really wasn't any other way to do it.
> And wasn't C89 supposed to be about codifying existing practice, anyway?
Some compilers implemented the strict-aliasing optimization (And did lots of other strange things), some did not. The C standards committee chose to go with strict-aliasing since it provides a lot of optimization opportunities.
All that said, I'm not saying things are perfect by any means, as our conversation here shows. I don't think things are nearly as bad as people tend to think, however. Generally speaking, unless you're doing something a little bit shady it's possible to avoid any strict-aliasing issues, and if it's really not possible there's generally a way to simply "turn it off", albeit that may result in a little less portable code - though if you're breaking strict-aliasing the portability of your code is already a bit suspect to begin with.
> Strictly speaking, that's not true. You can examine it directly, but it requires casting the `struct sockaddr *` to a `sa_family_t *` instead. This is valid because regardless of what type of object it really is, it must start with a `sa_family_t` entry,...
No, that isn't guaranteed by POSIX - it has to have an sa_family_t member, but it doesn't have to be the first one.
I also think it's problematic that the use explicitly contemplated by POSIX is considered ill-formed C.
> Besides that, you could also use a `union` to combine all the possible `sockaddr`s that you're going to handle together with a `sockaddr_storage`. Then you can do things like normal and simply access the `sockaddr` through the correct union member.
I do not think this is that easy when you include sockaddr_un into the mix, because of the way that sockaddr doesn't include the full size of its path member. This is, in fact, the point that I throw up my hands and use -fno-strict-aliasing because the fact that pointer provenance, rather than just value and type, is important together with the fact that it's not actually clear whether you've correctly laundered the pointer through a union or not, makes it all too... grey.
> Some compilers implemented the strict-aliasing optimization (And did lots of other strange things), some did not.
C compilers existed in 1989 that would assume different structure types with common initial sequences couldn't alias? With the "common initial sequence" carve-out in §3.3.2.3? I am sceptical...
I'm gonna try to address everything I can in this reply. Sorry it has taken so long.
> C compilers existed in 1989 that would assume different structure types with common initial sequences couldn't alias? With the "common initial sequence" carve-out in §3.3.2.3? I am sceptical...
That's not quite what I was talking about. What some compilers would do is not generate extra reads when you had a `long *` and an `int *` in the same scope - with the idea being that those two cannot point to the same data, and thus it is not necessary to reread the data from the `int *` when you write through the `long *`. Compilers have now taken it a slight step further - an `int` that belongs to a `struct a` and an `int` that belongs to a `struct b` can't alias - but it is really not much different from the original idea (And hence why it is legal). What the standard really describes is that objects have a single defined type and that accessing objects through a type other than their original type is invalid, which fits with what compiler writers have taken to doing. That said, I would not be opposed to the standard simply making that legal. While avoidable in most cases, it does cause problems in some instances (BSD sockets being a very notable example), and I'd wager it only brings marginal optimizations (for which `restrict` already provides a solution).
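A sketch of the original long*/int* optimization described above:

/* Under strict aliasing the compiler may assume *ip and *lp never name the
   same object, so it can return the constant 1 without reloading *ip after
   the store through lp. */
int f(int *ip, long *lp)
{
    *ip = 1;
    *lp = 2;       /* assumed not to modify *ip */
    return *ip;    /* may be folded to 1 */
}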
> I do not think this is that easy when you include sockaddr_un into the mix, because of the way that sockaddr doesn't include the full size of its path member. This is, in fact, the point that I throw up my hands and use -fno-strict-aliasing because the fact that pointer provenance, rather than just value and type, is important together with the fact that it's not actually clear whether you've correctly laundered the pointer through a union or not, makes it all too... grey.
Technically, you could use a `char` array for the `sockaddr_un`, and then just cast it to the right type. That's legal because `char` can alias. That said I'm fairly sure that `sockaddr_un` has a defined size - it doesn't use a FAM in implementation, it's just that the length of its path member can vary. The POSIX standard isn't as clear as it could be, but notes that it is left undefined only for the reason that different Unixes use different max lengths, and it says that it's typically somewhere in the range of 92 to 108. That along with the typical usage of `sockaddr_un` implies to me that it is perfectly fine to declare one, it just doesn't have a guaranteed length. Used in a `union` it should be fine. (All that said, I think what you've said also shows another current issue with C - there should be a way to statically declare a `struct` that has a FAM at the end by providing the length for the FAM. There's no way to do this currently except using a `char` array and casting, which is not an acceptable solution IMO.)
On that note though, the entire issue here could actually be largely resolved by simply adding the `may_alias` gcc attribute to the definition of `struct sockaddr` (And `struct sockaddr_storage`). It would declare that a `sockaddr` can alias any other type you want, and thus would make it legal to read from a `sockaddr` and then cast it to another type for use - removing the need for the `union` BS and all the other various hacks to get around this issue. Obviously that's not standard C, but I think it makes a pretty good argument that adding something like `may_alias` to the standard would be a very good addition.
And that touches on the larger problem with strict aliasing as I see it - there's no way to opt out of it. We have `restrict`, which ironically lets us declare no-aliasing for pointers to which strict aliasing doesn't even apply, but we have no way to tell the compiler that two pointers (or types) can alias when it thinks they can't. `may_alias` is one solution, but really any solution that fixes the problem would be extremely welcome in my book. I think the standards writers currently consider `union` to be the solution, but IMO that's simply not sufficient.
> No, that isn't guaranteed by POSIX - it has to have an sa_family_t member, but it doesn't have to be the first one.
>
> I also think it's problematic that the use explicitly contemplated by POSIX is considered ill-formed C.
As long as all the sa_family_t members in all of the various `sockaddr` types overlap, you could make it work (if they don't overlap, I feel like that would create lots of other issues). Obviously though this is a pretty clumsy solution.
And I would agree - I wouldn't say it's anybody's particular fault that we've hit this particular point (though you could argue that compiler writers jumped the gun on this one), but it is an issue worth addressing. I do think it's possible to use it correctly through a few different techniques, but 1. most programs already written don't do that, and 2. like I said before, you shouldn't have to jump through a million hoops (that aren't even mentioned anywhere) to use the interface correctly.
A common class assignment or interview question is to write your own memcpy. Towards the end you usually start optimizing it by copying multiple bytes at once. That is undefined behavior. You cannot just cast a pointer to uint32_t* and start using it, unless the underlying object is actually uint32_t. In practice it works fine, so people don't care. We'll see what future compilers will do, especially when the homemade memcpy is inlined somewhere.
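The kind of "optimized" homework memcpy being described looks roughly like this (a sketch for illustration, not a recommendation - the uint32_t casts are exactly the aliasing and alignment problem):

    #include <stddef.h>
    #include <stdint.h>

    void *my_memcpy(void *dst, const void *src, size_t n)
    {
        uint32_t *d = dst;               /* UB unless dst really holds uint32_t */
        const uint32_t *s = src;
        while (n >= sizeof(uint32_t)) {  /* copy a word at a time */
            *d++ = *s++;
            n -= sizeof(uint32_t);
        }
        unsigned char *dc = (unsigned char *)d;              /* tail bytes via char, */
        const unsigned char *sc = (const unsigned char *)s;  /* which may alias anything */
        while (n--)
            *dc++ = *sc++;
        return dst;
    }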
Another one is a custom malloc backed by a static char array. You're allowed to access any object as char*, but not the other way around. A static char array is always a char array, and accessing it through a pointer to anything else is a strict aliasing violation. Only the built-in malloc and siblings can create untyped memory.
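A sketch of that pattern, for anyone who hasn't seen it (alignment is also glossed over here):

    #include <stddef.h>

    /* The pool's declared type is unsigned char, so using the chunks handed
     * out here as ints (or anything else) is formally a strict-aliasing
     * violation, even though it almost always works in practice. */
    static unsigned char pool[4096];
    static size_t used;

    void *bump_alloc(size_t n)
    {
        void *p = &pool[used];
        used += n;
        return p;
    }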
There is an exception for accessing any object through a character type pointer, but not the other way around. uint32_t is not a character type, and it doesn't matter if it was cast to a char* first.
Also, apparently uint8_t may not be a character type.
> Also, apparently uint8_t may not be a character type.
I think that goes hand-in-hand with the fact that `char` is not guaranteed to be 8-bits wide in C, so by that fact `uint8_t` may not be a `char`. In practice I highly doubt this distinction really matters: Platforms without an 8-bit `char` are basically guaranteed to not support (a standards compliant) `uint8_t` anyway, and it is reasonable to target 8-bit `char` systems in which case it's safe to assume `uint8_t` and `char` are the same thing. (Well, `unsigned char`)
Yes, a system with char >8 bits wouldn't have a uint8_t type at all. But even when char and (u)int8_t are the same size, they may not be the same type. Compilers could go and apply strict aliasing to pointers to uint8_t. I'm at the limit of my C types knowledge here, so correct me if I'm just wrong.
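In practice uint8_t is a typedef for unsigned char on mainstream platforms; a C11 check like this (my own sketch) will confirm it at compile time - the point is only that the standard doesn't promise it:

    #include <stdint.h>

    /* Evaluates to 1 if uint8_t and unsigned char are the same type. */
    #define UINT8_IS_UNSIGNED_CHAR \
        _Generic((uint8_t)0, unsigned char: 1, default: 0)

    _Static_assert(UINT8_IS_UNSIGNED_CHAR,
                   "uint8_t is a distinct type here - char's aliasing exemption may not apply");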
Only if the pointer is correctly aligned for the uint32_t data type. Otherwise you might get problems with unaligned memory accesses. (Like when you get some data over the wire that is clearly just a memory dump of a C struct, so you just do a pointer cast. Boom, unaligned read.)
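The usual portable fix for wire data is to copy the bytes out with memcpy instead of casting the buffer pointer; it dodges both the aliasing and the alignment problem, and compilers generally turn it into a single load where that's safe:

    #include <stdint.h>
    #include <string.h>

    uint32_t read_u32(const unsigned char *buf)
    {
        uint32_t v;
        memcpy(&v, buf, sizeof v);   /* byte-wise copy, no aliasing violation */
        return v;
    }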
It killed the one thing C was good at - simplicity (you know exactly what happens where; note I'm not saying speed, as C++ can be quite a bit faster than C).
> Now, due to language lawyering, you can't just know C and your CPU, you have to know your compiler (and every iteration of it!).
This mythical time never existed. You always had to know your compiler -- C simply isn't well specified enough that you can accurately predict the meaning of many constructs without reference to the implementation you're using.
It used to be, if anything, much much worse, with different compilers on different platforms behaving drastically differently.
This is not really correct. The kinds of implementation dependencies usually encountered reflected processor architecture. The C standards committee and compiler community have created a situation in which different levels of "optimization" can change the logical behavior of the code! Truly a ridiculous state of affairs. The standards committee has some mysterious idea in mind, I suppose, but the compiler writers who want to do program transformation should work on Mathematica or Prolog, not C.
Compiler writers have to use program transformation to do well on benchmarks. Developers who don't prioritize benchmarks probably don't use C, and if they do they really shouldn't, because sacrificing correctness for speed is the only thing C is good for these days.
Speed isn't the only reason to use C. I often use C not because it's fast, but because C is by far the simplest mainstream programming language. All these UB warts notwithstanding, no language's interface comes as close as C's to a basic, no-frills register machine.
Yeah, we must have different definitions of simplicity, as I expected.
Just off the top of my head, Java has the following complexities that C lacks: exceptions, garbage collection, class inheritance, generics, and a very large standard library.
What definition of simplicity are you using when you say Java is simpler than C?
Optimisers are what made C what it is: they convert the idealised PDP-11 assembly into something efficient on modern computers, and speed is something C programmers care about.
In the large, no, they don't care about the 90-95% of the code base that's not performance critical. And these days, the stuff that is critical will be #ifdef and asm(...) stew.
I can't tell you how many projects I have been on where disabling optimization made no measurable difference in performance.
This being said, I cannot speak for game devs nor video device driver developers.
I have to say, I have never encountered a program where compiling without optimizations made no difference. If you have seen that, then I would agree that C was a very, very poor choice for that particular domain.
It made no measurable difference. The acceptance criteria for the system were not materially affected. Since optimization did not improve performance by that measure, it was generally set one way (mostly off) and left that way.
Teams I'm on have written some tight - but readable - code, too. Well-architected, low-latency, low-jitter.
You can try an experiment and build an application like Firefox from source with disabled optimizations. I bet you'll notice a massive difference.
Even more important are things that run in datacenters on thousands and thousands of machines. Even if you suppose that optimizations make only a minuscule difference, at the scale of today's infrastructure 5% fewer machines can save huge amounts of electricity.
Why would I want to build Firefox from source? I don't build things that run on thousands of machines. Last thing I ran any code on had a 2500 HP diesel; the electronics were in the noise.
If you do know your compiler and your CPU (singular), you're probably not really programming C.
Conversely, if you maintain software that compiles on a bunch of compilers, operating systems and architectures (particularly little endian + big endian, 32 bit + 64 bit), then it's probably written in something rather like C. A lot of people do this.
I don't think this Q&A format makes for a good case of not knowing C.
I mean I got all answers right without thinking about them too much, but would I too if I had to review hundreds of lines of someone else's code? What about if I'm tired?
It's easy to spot mistakes in isolated code pieces, especially if the question already tells you more or less what's wrong with it. But that doesn't mean you'll spot those mistakes in a real codebase (or even when you write such code yourself).
This is further compounded by how difficult it is to build useful abstractions in C, meaning that much real-world C consists of common patterns, and reviewers focus on recognizing common patterns, which increases the chances that small things slip through code review.
Agreed that these little examples aren't too difficult, especially if you have experience, but I certainly do not envy Linus Torvalds' job.
It's worth noting that in example #12, the assert will only fire in debug builds (i.e. when the macro NDEBUG is not defined). So, depending on how the source is compiled, it may be possible to invoke the div function with b == 0.
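A hypothetical reconstruction of the shape of that code (the actual quiz example may differ):

    #include <assert.h>

    int div_checked(int a, int b)
    {
        assert(b != 0);   /* compiled away when NDEBUG is defined */
        return a / b;     /* UB if b == 0 (and if a == INT_MIN && b == -1) */
    }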
IMHO the problem is with compilers (and their developers) who think UB really means they can do anything, when what programmers usually expect is, and the standard even notes for one of the possible interpretations of UB, "behaving during translation or program execution in a documented manner characteristic of the environment".
>the problem is with compilers (and their developers) who think UB really means they can do anything
But that's exactly what undefined behavior means.
The actual problem is that programmers are surprised-- that is, programmers' expectations are not aligned with the actual behavior of the system. More precisely, the misalignment is not between the actual behavior and the specified behavior (any actual behavior is valid when the specified behavior is undefined, by definition), but between the specified behavior and the programmers' expectations.
In other words, the compiler is not at fault for doing surprising things in cases where the behavior is undefined; that's the entire point of undefined behavior. It's the language that's at fault for specifying the behavior as undefined.
In other other words, if programmers need to be able to rely on certain behaviors, then those behaviors should be part of the specification.
In some sense the language is the compiler and the compiler is the language; the language is much like a human language, used for its utility in expressing things (ideas, programs). You can tell if your human language words work by determining if people understand you. If people start being obtuse and refusing to understand you because of an arbitrary grammar rule that isn't really enforced, you'd be right to be upset with the people just as much as the grammar.
It in fact doesn't matter at all what the standard says if GCC and LLVM say something different, because you can't use the standard to generate assembly code.
The standard doesn't have anything to say about UB, so it's the compiler's responsibility to do the most reasonable, least shocking thing possible with it: if I'm a GCC developer, and you ran GCC on one of these fairly mundane examples, and it compiled without error and then ran rm -rf / or stole your private RSA keys and posted them on 4chan, and I said "well, you can't be mad because it's undefined, it's the standard's fault", you'd probably punch me in the face after some quick damage control.
If it deletes an if branch or terminates a spinlock early, that's potentially even worse than those two examples.
>In some sense the language is the compiler and the compiler is the language; the language is much like a human language, used for its utility in expressing things (ideas, programs). You can tell if your human language words work by determining if people understand you. If people start being obtuse and refusing to understand you because of an arbitrary grammar rule that isn't really enforced, you'd be right to be upset with the people just as much as the grammar.
The shortcoming of this interpretation is that programs are not (only) consumed by humans; they're consumed by computers as well. Computers are not at all like humans: there is no such thing as "understanding" or "obtuseness" or even "ideas." You cannot reasonably rely on a computer program, in general, to take arbitrary (Turing-complete!) input and do something reasonable with it, at least not without making compromises on what constitutes "reasonable."
Along this line of thinking, the purpose of the standard is not to generate assembly code; it's to pin down exactly what compromises the compiler is allowed to make with regards to what "reasonable" means. It happens that C allows an implementation to eschew "reasonable" guarantees about behavior for things like "reasonable" guarantees about performance or "reasonable" ease of implementation.
Now, an implementation may choose to provide stronger guarantees for the benefit of its users. It may even be reasonable to expect that in many cases. But at that point you're no longer dealing with C; you're dealing with a derivative language and non-portable programs. I think that for a lot of developers, this is just as bad as a compiler that takes every liberty allowed to it by the standard. The solution, then, is not for GCC and LLVM to make guarantees that the C language standard doesn't strictly require; the solution is for the C language standard to require that GCC and LLVM make those guarantees.
Of course, it doesn't even have to be the C language standard; it could be a "Safe C" standard. The point is that if you want to simultaneously satisfy the constraints that programs be portable and that compilers provide useful guarantees about behavior, then you need to codify those guarantees into some standard. If you just implicitly assume that GCC is going to do something more or less "reasonable" and blame the GCC developers when it doesn't, neither you nor they are going to be happy.
On the other hand, the expected and desirable behavior in one platform might be different from that in another platform. It's possible to overspecify and end up requiring extra code when performing ordinary arithmetic operations, or lock yourself out of useful optimizations.
Which is exactly the motivation behind implementation-defined behavior. There's a broad range of "how much detail do you put in the specification" between the extremes of "this is exactly how the program should behave" and "this program fragment is ill-formed, therefore we make no guarantees about the behavior of the overall program whatsoever."
Implementation-defined behavior at best just tells you that the behavior is guaranteed to be deterministic (or not). You still cannot reason about the behavior of the program by just looking at the source.
And I'm not sure if optimizations such as those that require weak aliasing would be possible if the behavior was simply implementation-defined.
The desire to reason about the precise behavior of a program and the desire to take advantage of different behavior on different platforms are fundamentally at odds. Like I said, there's a broad range of just how much of the specification you leave up to the implementation; it's an engineering trade-off like any other.
My point is that by leaving a behavior to be defined by the implementation you're not making it any easier to write portable code, but you may be making some optimizations impossible.
That's not entirely true. Regarding portability, the layout of structs for example is implementation-defined to allow faster (by virtue of alignment) accesses or more compact storage depending on the system, but it's perfectly possible to write portable code that works with the layout using offsetof and sizeof (edit: and, of course, regular member access :) ).
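For example (my own illustration): the padding and offsets below are implementation-defined, but the code that queries them is portable.

    #include <stddef.h>
    #include <stdio.h>

    struct packet {
        char tag;
        int  value;   /* an implementation may insert padding before this */
    };

    int main(void)
    {
        printf("value at offset %zu of %zu bytes\n",
               offsetof(struct packet, value), sizeof(struct packet));
        return 0;
    }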
That said, I would agree that, on the whole, C leans too heavily on under-specified behavior of every variety. It's just not an absolute.
It's a stupid convention of compiler writers and standards writers at the expense of common sense and engineering standards. In fact there are many thousands of lines of C code that depend on compilers doing something sensible with UB. For example 0 is a valid address in many cases (even in some versions of UNIX). The decision to allow compiler writers to make counter-factual assumptions on the basis of UB is the kind of decision one expects from petty bureaucrats.
>For example 0 is a valid address in many cases (even in some versions of UNIX).
0 may be a valid address at runtime, but a NULL pointer is always invalid.
On such platforms, the compiler should handle pointers to address 0 correctly - and the NULL pointer need not have an all-zero representation, but must not compare equal to any valid pointer.
But the constant 0 or NULL, when converted to a pointer type, MUST result in a null pointer value - which may have a nonzero representation. Dereferencing such a pointer is UB.
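A sketch of the distinction (my own; the second half is implementation-defined, which is the point):

    #include <stdint.h>

    int *p = 0;   /* the constant 0 always converts to a null pointer,
                     whatever bit pattern the implementation uses for it */

    volatile int *address_zero(void)
    {
        uintptr_t addr = 0;            /* a runtime value, not a null pointer constant */
        return (volatile int *)addr;   /* implementation-defined; on some platforms
                                          this really does address location 0 */
    }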
People have been a little sloppy with the terms, but there's a difference between implementation defined behavior and undefined behavior. Generally, the committee allows undefined behavior when it doesn't believe a compiler can detect a bug cheaply.
Of course, many programmers complain about how the committee defines "cheaply." Trying to access an invalid array index is undefined because the way to prevent that kind of bug would be to add range checking to every array access. So, each extra check isn't expensive, but the committee decided that requiring a check on every array access would be too expensive overall. The same applies to automatically detecting NULL pointers.
And the fact that the standard doesn't require a lot -- a C program might not have an operating system underneath it, or might be compiled for a CPU that doesn't offer memory protection -- means that the committee's idea of "expensive" isn't necessarily based on whatever platforms you're familiar with.
But it is certainly true that a compiler can add the checks, or can declare that it will generate code that acts reliably even though the standard doesn't require it. And it's even true that compilers often have command line switches specifically for that purpose. But in general I believe those switches make things worse: your program isn't actually portable to other compilers, and when somebody tries to run your code through a different compiler, there's a very good chance they won't get any warnings that the binary won't act as expected.
Why restrict yourself to one compiler if you can write portable code?
Clang and gcc provide flags that enable nonstandard behavior, and you can use static and dynamic tools (asan, ubsan) to detect errors in your code; it does not have to be hard to write correct code.
Strict aliasing and ODR violations are extremely difficult to detect; these are the poster children for "undefined behavior that's hard to avoid and could seriously ruin your day if the compiler gets wind of it."
There does appear to finally be a strict aliasing checker, but I have no experience with it.
In the main, people seem to be unfamiliar with what lies underneath C, so they never seem to really get this idea that you might be able to (or want to) expect any behaviour other than that imposed by its own definition.
Right. Except for a few optimizer edge cases, you generally know what "undefined behavior" is going to spit out on a particular machine. Signed integer overflow, for example, almost always happens exactly the way you'd expect.
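The classic optimizer edge case being alluded to: because signed overflow is undefined, a compiler may fold this to "return 1" even when x == INT_MAX, which is not what two's-complement wraparound would give you.

    int always_bigger(int x)
    {
        return x + 1 > x;   /* may be compiled to "return 1" unconditionally */
    }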
Sorta. I write mostly Go (some JS, PHP) and I got 6/10, forgetting mostly stupid stuff like passing (-INT_MIN, -1) to #12.
But some of those are prevalent in Go. For example, 1.0 / 1e-309 is +Inf in Go, just as it is in C—it's IEEE 754 rules. int might not always be able to hold the size of an object in Go, just like C. In Go #6 wraps around and is an infinite loop, just like C.
The questions that don't, in some way, translate to Go are #2, #7, #8, and #10.
But, to your credit, I do like how Go has very limited UB (basically race conditions + some uses of the unsafe package) and works pretty much how you'd expect it to work.
1. Unless C's variable definition rules are completely different from C++'s, int i; is a full definition, not a declaration. If both definitions appear at the same scope (e.g. global), this will cause either a compiler error or a linker error. A variable declaration would be extern int i;
`int i;` at file scope is a tentative definition - if, by the end of the translation unit, no external definition has been seen, the tentative definition becomes a definition (with an implicit zero initializer); otherwise it is just a declaration.
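A sketch of how that plays out at file scope (valid C; C++ rejects the repeated lines):

    int i;          /* tentative definition */
    int i;          /* another tentative definition of the same object */
    int i = 3;      /* the external definition; without it, i would end up
                       zero-initialized */

    extern int j;   /* a pure declaration - never a definition by itself */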
As a former C programmer, you know not to fool around at the max bounds of a type. That avoids all of the integer overflow/underflow conditions. When in doubt, you just throw a long or unsigned on there for insurance. :)
I got every single one right. Does that mean I know C through and through? Perhaps. But all of these are the 'default' FAQ pitfalls of C, not the really tricky stuff.
I made this post as a response. Disclaimer: yet another programming language trying to dethrone C. People seem to be less enthusiastic about the subject these days.
I feel bad because I'm smart enough to answer these questions correctly in a quiz format, but if I saw any of them in production code, I would not even think twice about it.
(the quiz questions themselves lead you on, plus I read the MIT paper on undefined behavior that was posted on here back in 2013)
That's not a sequence point violation. The C standard makes it clear that zp gets xp + *yp prior to the increment. Quoting 6.5.2.4
> The result of the postfix ++ operator is the value of the operand. After the result is obtained, the value of the operand is incremented. (That is, the value 1 of the appropriate type is added to it.) See the discussions of additive operators and compound assignment for information on constraints, types, and conversions and the effects of operations on pointers. The side effect of updating the stored value of the operand shall occur between the previous and the next sequence point.
Judging by the .cc extension, you are compiling this with a C++ compiler. Quoting from Annex C (which documents the incompatibilities between C++ and ISO C) of the C++ standard:
Change: C++ does not have "tentative definitions" as in C. E.g., at file scope,

    int i;
    int i;

is valid in C, invalid in C++. This makes it impossible to define mutually referential file-local static objects, if initializers are restricted to the syntactic forms of C. For example,

    struct X { int i; struct X *next; };
    static struct X a;
    static struct X b = { 0, &a };
    static struct X a = { 1, &b };

Rationale: This avoids having different initialization rules for fundamental types and user-defined types.

Effect on original feature: Deletion of semantically well-defined feature.

Difficulty of converting: Semantic transformation.

Rationale: In C++, the initializer for one of a set of mutually-referential file-local static objects must invoke a function call to achieve the initialization.

How widely used: Seldom.
void foo() { while (1) {} }
will loop forever, but
void foo(int i) { while (i) {} }
is permitted to terminate...even if i is 1:
> An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate
To make things a bit worse, LLVM can incorrectly make both of the above terminate - https://bugs.llvm.org//show_bug.cgi?id=965.