
The mentality of compiler authors writing these kinds of optimizations needs examination. Specifically, inferring value constraints from value use, and eliminating branches on that basis, is suspect for a very simple reason: if you accept the premise that value constraints can be inferred from code, then the branch itself is evidence too. An alternate branch exists, and presumably the user wrote it for a reason. You can't have your cake and eat it.

Technically correct is not a useful standard. Working code is. If compilers make code like this fail, it should fail very loudly, not quietly; silently eliminating null checks is not a virtue unless the null-pointer use is guaranteed to blow up otherwise. That isn't always the case in practice, as the memset experiment here shows.
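
To make the pattern concrete, here's a minimal sketch of the kind of code under discussion (names are mine, not the article's):

    #include <stdio.h>

    void log_first(int *p) {
        int first = *p;   /* dereference first: the compiler infers p != NULL */
        if (p == NULL)    /* the check the programmer wrote deliberately... */
            return;       /* ...is now "dead" and can be silently removed */
        printf("%d\n", first);
    }

If the dereference doesn't actually trap (say, page 0 is mapped), the check is gone and execution carries on with garbage. GCC at least has -fno-delete-null-pointer-checks to turn this particular transformation off.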

If compiler authors really want to pursue this optimization, eliminating dead code by detecting undefined behaviour should result in a compilation error, because the code's intent is self-contradictory.




It's not that simple, because of inlining and macros.

Those optimizations can be used to quickly throw out unnecessary code like null-pointer checks inside inlined functions, so they are valuable and good to have.
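
A hedged sketch of the benign case (function names made up):

    #include <string.h>

    static inline int safe_len(const char *s) {
        if (s == NULL)             /* defensive check inside the helper */
            return 0;
        return (int)strlen(s);
    }

    int caller(void) {
        char buf[16] = "hi";
        return safe_len(buf);      /* buf is a local array and can't be NULL,
                                      so after inlining the check is provably
                                      dead - removing it is pure win */
    }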

So it's a matter of how much energy you spend on diagnostics, which ends up being a rather heuristic exercise. Perhaps we'd be better off focusing on better static analysis tools that are separate from, or can at least be decoupled from, compilers.


Are we better served by fast and correct code, or faster and wrong?

The gain from applying an optimization needs to be weighed against the time and cost of the bugs it introduces. And dead code elimination isn't necessarily a huge win - if you're fairly sure code is dead, you can make it cheap at the cost of it being more expensive should it actually turn out to be alive (see my other comment).


Unnecessary code is one thing; exploiting the presence of undefined behavior to compile something that the programmer likely didn't want is another. It doesn't matter how fast it is if it's wrong. A good compiler shouldn't try to read the mind of the programmer; it should complain loudly and stop. It is free to do so, given the leeway it has with undefined behavior.

And of course, the standards committee should get off their asses and define some behaviors.


One could at least throw an error / warning in "obvious" cases, like this.


Telling apart dead code the programmer thought was meaningful from dead code that is the result of a macro or inlined function is difficult. Using functions and macros very often introduces dead branches, and optimizing them out is a huge performance win. The fact that some already-incorrect code behaves incorrectly after these optimizations are applied is not a good reason to abandon an entire class of standards-compatible optimizations.


There's deducing the range of a variable due to constant expression evaluation and control flow, and then there's deducing its range because of undefined behaviour.
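
A sketch of the two cases side by side (illustrative only):

    void g(int a[4], int *p) {
        for (int i = 0; i < 4; i++)
            a[i] = 0;    /* 0 <= i < 4: range deduced from control flow */

        int v = *p;      /* UB if p == NULL, so the compiler assumes it isn't */
        if (p == NULL)   /* "dead" purely by virtue of that UB-derived
                            assumption */
            return;
        a[0] = v;
    }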

I believe this analysis is possible: taint the control-flow analysis with a flag marking facts derived from undefined behaviour. Macro expansion and inlined functions are also not intractable; debug information already requires tracking token locations all the way through to generated code, so that same information can be used to apply a heuristic.

Dead branches don't need to cost that much, BTW. Put the code for disfavoured basic blocks out of line with other code (so it doesn't burn up code cache), and use hinted conditional jumps to enter them (e.g. on x86, forward conditional jumps are predicted not to be taken).
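
Something like this, using GCC/Clang extensions (a sketch; __builtin_expect and the cold attribute are hints, not guarantees):

    #include <stdlib.h>

    __attribute__((cold, noinline))
    static void handle_error(void) {
        abort();   /* disfavoured block lives out of line, off the hot
                      path's cache lines */
    }

    void process(int *p) {
        if (__builtin_expect(p == NULL, 0))   /* hint: almost never taken */
            handle_error();
        *p = 42;   /* hot path stays straight-line */
    }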


That assumes that removing apparently-dead branches from macros and inlined functions won't cause issues for incorrect code that relies on undefined behavior.


What about throwing an error or warning by default, but only in cases where it is "obvious"? (i.e. only where the check was explicitly written out, with no macros / function boundaries / etc. involved)

Ditto, if you have an explicit block (i.e. no macros / etc) that always causes undefined behavior, warn or throw an error by default.
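
For instance (a minimal made-up example of an explicit, always-UB block):

    #include <stddef.h>

    void f(void) {
        int *p = NULL;
        *p = 42;   /* unconditionally UB, written out directly with no macro
                      or inlining involved - a natural candidate for a
                      default-on diagnostic */
    }

IIRC some compilers will already warn about patterns like this under -Wnull-dereference, at least when optimizing.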


Yeah, let's add quirks mode to C, because it wasn't enough of a minefield already!



