
To be pedantic, I think you're talking about unspecified behavior and implementation-defined behavior. Undefined behavior specifically refers to things that have no meaningful semantics, so the compiler assumes they never happen.

Unspecified behavior is anything outside the scope of observable behavior where the implementation may choose among two or more possibilities.

Since the timing of instructions on machines with speculative execution is not observable behavior in C, anything that impacts it is unspecified.

There's really no way around this, and I disagree that there's an "unreasonable" amount of it. Ultimately it is up to the compiler developers to judge which choice to make, and up to users to pick implementations based on those choices or work around them as needed.




I am referring to undefined behavior.

For example, consider the case of integer overflow when adding two signed numbers. C considers this undefined behavior, which makes the behavior of the entire program undefined. All bets are off, even if the program never makes use of the resulting value. C compilers are allowed to assume the overflow can never happen, which in some cases allows them to infer that numbers must fit within certain bounds, which in turn allows them to do things like optimize away bounds checks written by the programmer.
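To make that concrete, here is a minimal sketch in C. The function names are hypothetical and chosen only for illustration. The first check relies on wrap-around, so an optimizer is allowed to simplify `offset + len < offset` to `len < 0`, which silently removes the check for every non-negative `len`. The second version asks the same question without ever performing an overflowing addition.

```c
#include <limits.h>
#include <stdbool.h>

/* Fragile: if offset + len overflows, the behavior is undefined.
   A compiler may assume that never happens and reduce the whole
   test to `len < 0`. */
bool end_overflows_fragile(int offset, int len) {
    return offset + len < offset;
}

/* Robust: compares against the limits before adding, so no
   overflowing operation is ever evaluated. */
bool end_overflows(int offset, int len) {
    if (len > 0)
        return offset > INT_MAX - len;  /* INT_MAX - len cannot overflow here */
    return offset < INT_MIN - len;      /* len <= 0, so INT_MIN - len cannot overflow */
}
```

With these definitions, `end_overflows(INT_MAX, 1)` reports an overflow, while calling `end_overflows_fragile(INT_MAX, 1)` is itself undefined behavior.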

A more reasonable language design choice would be to treat this as an operation that produces an unspecified integer result, or an implementation-defined result.
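For a rough idea of what such semantics look like, C code can already opt into them by routing the addition through unsigned arithmetic. This is a sketch, and `wrapping_add` is a hypothetical helper name. Unsigned addition is defined to wrap modulo 2^N, and the conversion back to `int` is implementation-defined rather than undefined, so the compiler cannot draw "this never happens" conclusions from it.

```c
#include <limits.h>  /* INT_MAX and INT_MIN, used in the usage note */

/* Adds two ints with wrap-around and no undefined behavior.
   The unsigned sum wraps modulo 2^N by definition. The cast back
   to int is implementation-defined when the value does not fit;
   GCC and Clang document it as modulo reduction. */
int wrapping_add(int a, int b) {
    return (int)((unsigned int)a + (unsigned int)b);
}
```

On GCC and Clang, `wrapping_add(INT_MAX, 1)` yields `INT_MIN`. Both compilers also accept `-fwrapv`, which gives plain signed `+` the same wrapping behavior for a whole translation unit.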

Edit: The following article helps clear up some common confusion about undefined behavior:

https://blog.regehr.org/archives/213

Unfortunately this article, like most on the subject, perpetuates the notion that there are significant performance benefits to treating simple things like integer overflow as UB. E.g.: "I've heard that certain tight loops speed up by 30%-50% ..." Where that is true, the compiler could still emit the optimized form of the loop without UB-based inference; it would simply have to guard it with a run-time check (outside of the loop) that falls back to the slower code on the rare occasions when the assumptions do not hold.
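A hand-written sketch of what that kind of loop versioning could look like, assuming a strided sum as the hot loop. `sum_strided` and its parameters are hypothetical, and this is not a claim about what any particular compiler emits. The guard is evaluated once, outside the loop; when it holds, the `int` induction variable provably cannot overflow, so the fast loop needs no UB-based assumption.

```c
#include <assert.h>
#include <limits.h>

/* Sums a[lo], a[lo + step], ... for all indices below hi.
   Assumes step > 0 and lo >= 0. */
long long sum_strided(const int *a, int lo, int hi, int step) {
    assert(step > 0 && lo >= 0);
    long long total = 0;

    if (hi <= INT_MAX - step) {
        /* Guard holds: i < hi implies i + step <= INT_MAX, so the
           increment can never overflow and the trip count is known
           up front. */
        for (int i = lo; i < hi; i += step)
            total += a[i];
    } else {
        /* Rare fallback: a wider index keeps the increment in range.
           Assumes long long is wider than int. */
        for (long long i = lo; i < hi; i += step)
            total += a[i];
    }
    return total;
}
```

The cost is the code duplication itself: two copies of the loop body and one extra comparison per call.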


Signed integer overflow being undefined has these three consequences for me: 1. It makes my code slightly faster. 2. It makes my code slightly smaller. 3. It makes my code easier to check for correctness, and thus makes it easier to write correct code.

Win, win, win.

Signed integer overflow would be a bug in my code.

As I do not write my own implementations to correctly handle the case of signed integer overflow, the code I am writing will behave in nonsensical ways in the presence of signed integer overflow, regardless of whether or not it is defined. Unless I'm debugging my code or running CI, in which case ubsan is enabled, and the signed overflow instantly traps to point to the problem.

Switching to UB-on-overflow in one of my Julia packages (via `llvmcall`) removed like 5% of branches. I do not want those branches to come back, and I definitely don't want code duplication where I have two copies of that code, one with and one without. The binary code bloat of that package is excessive enough as is.


Agreed. If anything, I'd like to have an unsigned type with undefined overflow so that I can get these benefits while also guaranteeing that the numbers are never negative where that doesn't make any sense.


That's what Zig did, and they solved the overflow problem by having separate operators for addition and subtraction that guarantee that the number saturates or wraps on overflow.
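For comparison, the same two explicit behaviors can be sketched in C as helper functions on an unsigned type. The names `wrap_add_u32` and `sat_add_u32` are made up for this example. Zig spells them as the operators `+%` (wrapping) and `+|` (saturating).

```c
#include <stdint.h>

/* Wrapping add: the result is reduced modulo 2^32. */
uint32_t wrap_add_u32(uint32_t a, uint32_t b) {
    return (uint32_t)(a + b);
}

/* Saturating add: clamps to UINT32_MAX instead of wrapping.
   If the wrapped sum is smaller than an operand, the true sum
   did not fit in 32 bits. */
uint32_t sat_add_u32(uint32_t a, uint32_t b) {
    uint32_t sum = (uint32_t)(a + b);
    return sum < a ? UINT32_MAX : sum;
}
```

Here `wrap_add_u32(UINT32_MAX, 1)` gives 0 and `sat_add_u32(UINT32_MAX, 1)` gives `UINT32_MAX`. What C lacks is the third option being asked for: an unsigned type whose plain `+` may be assumed never to overflow.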


It would also be nice if hardware would trap on signed integer overflow. Of course, since the most popular architectures do not, new architectures do not either.


The point is that much of what the C standard currently calls undefined behavior should instead be either unspecified or implementation-defined. This includes the controversial ones like strict aliasing and signed overflow.

Additionally, part of the problem is compiler devs insisting on code transforms that are unsound in the presence of undecidable UB, without giving the programmer sufficiently fine control over such transforms. At best we have a few command line flags for some of them; in the worst case you'd need to disable all optimizations, including the non-problematic ones.



