
Signed overflow being undefined behavior allows optimizations that wouldn't otherwise be possible

Quoting http://blog.llvm.org/2011/05/what-every-c-programmer-should-...

> This behavior enables certain classes of optimizations that are important for some code. For example, knowing that INT_MAX+1 is undefined allows optimizing "X+1 > X" to "true". Knowing the multiplication "cannot" overflow (because doing so would be undefined) allows optimizing "X*2/2" to "X". While these may seem trivial, these sorts of things are commonly exposed by inlining and macro expansion. A more important optimization that this allows is for "<=" loops like this:

> for (i = 0; i <= N; ++i) { ... }

> In this loop, the compiler can assume that the loop will iterate exactly N+1 times if "i" is undefined on overflow, which allows a broad range of loop optimizations to kick in. On the other hand, if the variable is defined to wrap around on overflow, then the compiler must assume that the loop is possibly infinite (which happens if N is INT_MAX) - which then disables these important loop optimizations. This particularly affects 64-bit platforms since so much code uses "int" as induction variables.
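
To make the loop case concrete, here is a sketch (the exact transformations depend on compiler and flags, and the names are purely illustrative):

    /* With signed "i", overflow in ++i is undefined, so the compiler may
       assume the loop runs exactly N+1 times (for N >= 0) and unroll or
       vectorize accordingly.  If overflow were defined to wrap (e.g. -fwrapv),
       N == INT_MAX would make "i" wrap to INT_MIN and the loop would never
       terminate, so no finite trip count could be assumed. */
    long sum(const long *a, int N)
    {
        long s = 0;
        for (int i = 0; i <= N; ++i)
            s += a[i];
        return s;
    }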




I've always thought that assuming such things should be wrong, because if you were writing the Asm manually, you would certainly think about it and NOT optimise unless you had a very good reason why it won't overflow. Likewise, I think that unless the compiler can prove the overflow can't happen, it should, like a sane human, refrain from making the assumption.


Well, by that reasoning, if you were coding in C, you would certainly think about it and ensure overflows won't happen.

The fact is that if the compiler encounters undefined behaviour, it can do basically whatever it wants and it will still be standard-compliant.


> for (i = 0; i <= N; ++i) { ... }

The worst thing is that people take it as acceptable that this loop is going to operate differently upon overflow (e.g. assume N is TYPE_MAX) depending on whether i and N are signed or unsigned.


Is this a real concern, beyond 'experts panel' esoteric discussion? Do folks really put a number into an int that sometimes needs to be exactly TYPE_MAX but no larger?

I've gone a lifetime programming, and this kind of stuff never, ever matters one iota.


Yes, people really do care about overflow, because it gets used in security checks, and if they don't understand the behavior then their security checks don't do what they expect.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475 shows someone going hyperbolic over the issue. The technical arguments favor the GCC maintainers. However, I prefer the position of the person going hyperbolic.


That example wasn't 'overflow', though; it was 'off by one', wasn't it? That seems uninteresting, outside of, as you say, the security issue where somebody might take advantage of it.


That example absolutely was overflow. The bug is, "assert(int+100 > int) optimized away".

In practice, GCC's behavior is that overflowing a signed integer wraps around to a negative number. But an if statement that TESTS for that is optimized away!

The reason is that overflow is undefined behavior, and therefore they are within their rights to do anything that they want. So they actually overflow in the fastest way possible, and optimize code on the assumption that overflow can't happen.
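
A minimal sketch of the pattern in that bug report (illustrative code, not the exact test case; the point is that the guard relies on overflow wrapping):

    #include <assert.h>
    #include <limits.h>

    static void check(int a)
    {
        /* Intended as an overflow guard, but a + 100 overflowing is UB, so an
           optimizing compiler may treat the condition as always true and drop
           the assert entirely (e.g. gcc -O2 without -fwrapv or -ftrapv). */
        assert(a + 100 > a);
    }

    int main(void)
    {
        check(INT_MAX - 10);  /* near INT_MAX, the "protection" silently vanishes */
        return 0;
    }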

The fact that almost no programmers have a mental model of the language that reconciles these two facts is an excellent reason to say that very few programmers should write in C. Because the compiler really is out to get you.


Sure. Sorry, I was ambiguous; I was thinking of the earlier example of ++i in a loop. Anyway, yes, overflow for small ints is a real thing.


The very few times I've ever put in a check like that, I always do something like i < INT_MAX - 5 just to be sure, because I'm never confident that I intuitively understand off-by-one errors.


Same here. But I instead run a loop over a range around INT_MAX (or wherever the issue is) and print the result, so I know I'm doing what I think I'm doing. Exhaustive testing is quick, with a computer!


This isn't a good idea either: if you're dealing with undefined behavior, the way the compiler translates your code can change from version to version, so you could end up with code that works with the current version of GCC but doesn't work on the next. Personally I don't agree with the way GCC and other compilers deal with UB, but that would be off topic.


Hm. May be off-topic by now. Incrementing an int is going to work the same on the same hardware, forever. It's nothing the compiler has any say in.


If a compiler decides that it's going to process:

    unsigned mul(unsigned short x, unsigned short y)
    { return x*y; }
in a way that causes calling code to behave in meaningless fashion if x would exceed INT_MAX/y [something gcc will sometimes actually do, by the way, with that exact function!], the hardware isn't going to have any say in that.
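
For anyone wondering how an all-unsigned-looking function can run into signed overflow, a sketch (assuming 16-bit unsigned short and 32-bit int; mul_fixed is just an illustrative name):

    /* The usual arithmetic conversions promote unsigned short to int, so x*y
       above is a *signed* multiply: 65535 * 65535 = 4294836225 > INT_MAX,
       which is undefined behavior.  A common fix forces unsigned arithmetic: */
    unsigned mul_fixed(unsigned short x, unsigned short y)
    { return 1u * x * y; }  /* 1u makes the whole product unsigned, which wraps */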


So in a corner case where you have a loop that iterates over all integer values (when does this ever happen?) you can optimize your loop. As a consequence, signed integer arithmetic is very difficult to write while avoiding UB, even for skilled practitioners. Do you think that's a useful trade-off, and do you think anything can be done for those of us who think it's not?


No, it's exactly the opposite. Without UB the compiler must assume that the corner case may arise at any time. Knowing it is UB we can assert `n+1 > n`, which without UB would be true for all `n` except INT_MAX. Standardising wrap-on-overflow would mean you can now handle that corner case safely, at the cost of missed optimisations on everything else.


I/we understand the optimization, and I'm sure you understand the problem it brings to common procedures such as DSP routines that multiply signed coefficients from e.g. video or audio bitstreams:

    for (int i = 0; i < 64; i++) result[i] = inputA[i] * inputB[i];

If inputA[i] * inputB[i] overflowed, why are my credit card details at risk? The question is: can we come up with an alternate behaviour that keeps the advantages of the i<=N optimisation yet leaves my credit card details safe if the multiplication in the inner loop overflows? Is there a middle road?


Another problem is that there's no way to define it, because in that example the "proper" way to overflow is with saturating arithmetic, and in other cases the "proper" overflow is to wrap. Even on CPUs/DSPs that support saturating integer arithmetic in hardware, you either need to use vendor intrinsics or control the status registers yourself.
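
For what it's worth, saturation can be written portably without UB, it's just clunky compared to having the hardware do it; a sketch:

    #include <limits.h>

    /* Saturating signed add with no undefined behavior: check the limits
       before adding instead of fixing things up afterwards. */
    static int sat_add(int a, int b)
    {
        if (b > 0 && a > INT_MAX - b) return INT_MAX;  /* would overflow upward */
        if (b < 0 && a < INT_MIN - b) return INT_MIN;  /* would overflow downward */
        return a + b;
    }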


One could allow the overflow behavior to be specified, for example at scope level. Idk, with a #pragma? Something like #pragma integer-overflow-saturate.


I'd almost rather have a separate "ubsigned" type which has undefined behavior on overflow. By default, integers behave predictably. When people really need that extra 1% performance boost, they can use ubsigned just in the cases where it matters.


I don't know if I agree. Overflow is like uninitialized memory: it's a bug almost 100% of the time, and cases where it is tolerated or intended to occur are the exception.

I'd rather have a special type with defined behavior. That's actually what a lot of shops do anyway, and there are some niche compilers that support types with defined overflow (ADI's fractional types on their Blackfin toolchain, for example). It's just annoying to do in C; this is one of those cases where operator overloading in C++ is really beneficial.
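
A sketch of what such a helper ends up looking like in plain C, which is exactly why it's annoying without operator overloading (the conversion back to signed is implementation-defined rather than undefined, and is two's-complement wraparound on mainstream compilers):

    #include <stdint.h>

    /* Wrap-on-overflow signed add: do the arithmetic in unsigned, which wraps
       by definition, then convert back to signed. */
    static int32_t wrapping_add_i32(int32_t a, int32_t b)
    {
        return (int32_t)((uint32_t)a + (uint32_t)b);
    }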


> I don't know if I agree. Overflow is like uninitialized memory: it's a bug almost 100% of the time, and cases where it is tolerated or intended to occur are the exception.

Right, but I think the problem is that UB means literally anything can happen and be conformant to the spec. If you do an integer overflow, and as a result the program formats your hard drive, then it is acting within the C spec.

Now, compiler writers don't usually format your hard drive when you trigger UB, but they often do things like remove input sanitization or other sorts of safety checks. It's one thing if, as a result of overflow, the number in your variable isn't what you thought it was going to be. It's completely different if suddenly safety checks get tossed out the window.

When you handle unsanitized input in C on a security boundary, you must literally treat the compiler as a "lawful evil" accomplice to the attackers: you must assume that the compiler will follow the spec to the letter, but will look for any excuse to open up a gaping security hole. It's incredibly stressful if you know that fact, and incredibly dangerous if you don't.


> When you handle unsanitized input in C on a security boundary, you must literally treat the compiler as a "lawful evil" accomplice to the attackers: you must assume that the compiler will follow the spec to the letter, but will look for any excuse to open up a gaping security hole. It's incredibly stressful if you know that fact, and incredibly dangerous if you don't.

I'd say more chaotic evil, since the Standard has many goofy and unworkable corner cases, and no compiler tries to handle them all except, sometimes, by needlessly curtailing optimizations. Consider, for example:

    int x[2];
    int test(int *restrict a, int *b)
    {
      *a = 1;
      int *p = x+(a!=b);
      *p = 2;
      return *a;
    }
The way the Standard defines "based upon", if a and b are both equal to x, then p would be based upon a (since replacing a with a pointer to a copy of x would change the value of p). Some compilers that ignore "restrict" might generate code that accommodates the possibility that a and b might both equal x, but I doubt there are any that generally try to optimize based on the restrict qualifier yet would hold off in this case.


Undefined integer overflow buys more than a 1% performance boost, as it lets the compiler do a lot of things with loops.


I once did a stupid test using either an int or an unsigned as the for-loop variable; the performance hit was about 1%. The problem is that modern processors can walk, chew gum, and juggle all at the same time, which tends to negate a lot of simplistic optimizations.

Compiler writers tend to assume the processor is a dumb machine. But modern ones aren't: they do a lot of resource allocation and optimization on the fly, and they do it in hardware in real time.


> modern processors can walk, chew gum, and juggle all at the same time

It's easier than it sounds. One of the major problems you usually run into when learning to juggle is that you throw the balls too far forward (their arc should be basically parallel to your shoulders, but it's easy to accidentally give them some forward momentum too), which pulls you forward to catch them. Being allowed to walk means that's OK.

(For the curious, there are three major problems you're likely to have when first learning to juggle:

1. I can throw the balls, but instead of catching them, I let them fall on the ground.

2. My balls keep colliding with one another in midair.

3. I keep throwing the balls too far forward.)

There's actually a niche hobby called "joggling" which, as the name implies, involves juggling while jogging.


> Compiler writers tend to assume the processor is a dumb machine.

A lot of C developers tend to assume the compiler is a dumb program ;) There are significant hoisting and vectorization optimizations that treating signed overflow as undefined can unlock, but they can't always be applied.
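
A sketch of one such case, assuming a 64-bit target (the function and names are illustrative):

    /* With "int i", the compiler may assume i + off never wraps, so the access
       src[i + off] advances linearly and the index can be widened to a 64-bit
       pointer walk, enabling hoisting and vectorization.  If "i" were unsigned,
       wraparound at 2^32 would be defined behavior and would have to be
       preserved, which can block the same transformation. */
    void add_offset(float *dst, const float *src, int n, int off)
    {
        for (int i = 0; i < n; i++)
            dst[i] = src[i + off];
    }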


If C had real array types the compiler could do real optimizations instead of petty useless ones based on UB.


Fair, hence the push in many languages for range-based for loops that can optimize much better.


Have you considered adding intrinsic functions for arithmetic operations that _do_ have defined behavior on overflow, such as the overflowing_* functions in Rust?


The semantics most programs need for overflow are (1) that overflow does not have intolerable side effects beyond yielding a likely-meaningless value, and (2) in some cases, a way to know whether an overflow might have produced an observably-arithmetically-incorrect result. A smart compiler for a well-designed language should in many cases be able to meet these requirements much more efficiently than it could by rigidly processing the aforementioned intrinsics.

A couple of easy optimizations, for example, that would be available to a smart compiler processing straightforwardly-written code with automatic overflow checking, but not to one fed code that uses intrinsics:

1. If code computes x=y*z, but then never uses the value of x, a compiler that notices that x is unused could infer that the computation could never be observed to produce an arithmetically-incorrect result, and thus there would be no need to check for overflow.

2. If code computes x*y/z, and a compiler knows that y=z*2, the compiler could simplify the calculation to x+x, and would thus merely have to check for overflow in that addition. If code used intrinsics, the compiler would have to overflow-check the multiplication, which on most platforms would be more expensive. If an implementation uses wrapping semantics, the cost would be even worse, since an implementation would have to perform an actual division to ensure "correct" behavior in the overflow case.
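
Rendered in C terms as a rough analogy (the function is illustrative): under loose or undefined overflow the simplification is legal, while under mandated wrapping, e.g. gcc's -fwrapv, it is not, because the wrapped intermediate product can change the result.

    /* With overflow undefined (or merely "loose"), x * 30 / 15 may be folded
       to x + x: the two only differ when x*30 overflows, and that case
       carries no guarantees.  With -fwrapv, the wrapped product must actually
       be computed and then divided, which is what makes mandated wrapping the
       more expensive option here. */
    int scaled(int x) { return x * 30 / 15; }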

Having a language offer options for the aforementioned style of loose overflow checking would open up many avenues of optimization which would be unavailable in a language that offers only precise overflow checking or no overflow checking whatsoever.


Oops, I meant the wrapping_* functions.


If one wants a function that will compute x*y/z when x*y doesn't overflow, and yield some arbitrary value (but without other side effects) when it does, wrapping functions will often be much slower than code that doesn't have to guarantee any particular value in case of overflow. If e.g. y is known to be 30 and z equal to 15, code using a wrapping multiply would need to be processed by multiplying the value by 30, truncating the result, and dividing that by 15. If the program could use loosely-defined multiplication and division operators, however, the expression could be simplified to x+x.


I hadn’t understood the utility of undefined behaviour until reading this, thank you.


N is a variable. It might be INT_MAX, so the compiler cannot optimise the loop for an arbitrary value of N, unless you make overflow UB.


No, the optimizations referred to include those that will make the program faster when N=100.



