
That sounds good in theory, but many things that are UB in C/C++ are UB because they are really hard to verify at compile time, which makes them almost impossible to program around. Any signed addition in C is potential UB unless you have a proof that no numbers that will ever be input to the addition can cause overflow (which is made harder because C doesn't define the size of the default integer types). Furthermore, a loop that makes no progress is UB, which means that as a programmer you have to solve the halting problem for your program before knowing whether it has a bug.



> many things that are UB in C/C++ are UB because they are really hard to verify at compile time which makes them almost impossible to program around

The second half of the sentence doesn't follow from the first. Take everyone's favorite example, signed integer overflow: all you have to do to avoid UB on signed integer overflow is check for overflow before doing the operation (and C23 finally adds features to do that for you).
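
For instance, the pre-C23 version of such a check looks something like this (a minimal sketch; safe_add is just an illustrative name):

    #include <limits.h>
    #include <stdbool.h>

    /* Returns false instead of overflowing; the kind of check C23's
       ckd_add now performs for you. */
    static bool safe_add(int a, int b, int *sum) {
        if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
            return false;   /* a + b would overflow */
        *sum = a + b;
        return true;
    }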

Taking a step back, the fundamental thing about UB is that it is very nearly always a bug in your code (and this includes especially integer overflow!). Even if you gave well-defined semantics to UB, the semantics you'd give would very rarely make the program not buggy. Complaining that we can't prove programs free of UB is tantamount to complaining that we can't prove programs free of bugs.

It turns out that UB is actually extremely helpful for tools that try to help programmers find bugs in their code. Since UB is automatically a bug, any tool that finds UB knows that it found a bug; if you give it well-defined semantics instead, it's a lot trickier to assert that it's a bug. In a real-world example, the infamous buffer overflow vulnerability Heartbleed stymied most (all?) static analyzers for the simple reason that, due to how OpenSSL did memory management, it wasn't actually undefined behavior by C's definition. Unsigned integer overflow also falls into this bucket--it's very hard to distinguish intentional cases of unsigned integer overflow (e.g., hashing algorithms) from unintentional cases (e.g., calculating buffer sizes).
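
To make that concrete, here's a sketch of the two cases (function names are illustrative); both are perfectly well-defined C, so a tool has no UB to latch onto:

    #include <stdint.h>
    #include <stdlib.h>

    /* Intentional wraparound: FNV-1a-style hashing depends on mod-2^32 math. */
    uint32_t fnv1a(const unsigned char *p, size_t n) {
        uint32_t h = 2166136261u;
        while (n--)
            h = (h ^ *p++) * 16777619u;   /* wraps by design */
        return h;
    }

    /* Unintentional wraparound: count * size can wrap to a small value,
       yielding an undersized buffer. Well-defined, but still a bug. */
    void *alloc_array(size_t count, size_t size) {
        return malloc(count * size);
    }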


My complaint here is that it took C more than 30 years to get from defining signed integer overflow as UB to providing programmers with standard-library facilities for checking whether a signed integer operation would overflow.

I much prefer Rust's approach to arithmetic, where overflow with plain arithmetic operators is defined as a bug, and panics on debug-enabled builds, plus special operations in the standard library like wrapping_add and saturating_add for the special cases where overflow is expected.


> My complaint here is that it took C more than 30 years ... I much prefer Rust's approach

That's an odd complaint. Rust didn't spring forth fully formed from the ether, it stands on the shoulders of C (and other giants of PL history). 30 years ago you couldn't use Rust at all because it didn't exist.

The reason the committee doesn't just radically change C in all these nice ways to catch up to Rust is because it would be incompatible. Then you wouldn't have fixed C, you'd just have two languages: "old C", which all of the existing C code in the world is written in, and "new C", which nothing is written in. At that point why not just start over from scratch, like they did with Rust?


Interestingly, the first Ada standard in 1983 defined signed integer overflow to raise a NUMERIC_ERROR exception (later folded into CONSTRAINT_ERROR).

But apparently it lacked unsigned integers with modular arithmetic?

http://archive.adaic.com/standards/83lrm/html/lrm-11-01.html... http://archive.adaic.com/standards/83lrm/html/lrm-03-05.html

The 2012 version is a bit more readable, and has unsigned integers:

For a signed integer type, the exception Constraint_Error is raised by the execution of an operation that cannot deliver the correct result because it is outside the base range of the type. For any integer type, Constraint_Error is raised by the operators "/", "rem", and "mod" if the right operand is zero.

For a modular type, if the result of the execution of a predefined operator (see 4.5) is outside the base range of the type, the result is reduced modulo the modulus of the type to a value that is within the base range of the type.

http://www.ada-auth.org/standards/rm12_w_tc1/html/RM-3-5-4.h...


> all you have to do to avoid UB on signed integer overflow is check for overflow before doing the operation

All you have to do is add a check for overflow _that the compiler will not throw away because "UB won't happen"_. The very thing you want to avoid makes avoiding it very hard, and lots of bugs have resulted from compilers "optimizing" away such overflow checks.
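
The canonical broken pattern looks something like this (illustrative sketch):

    #include <stdlib.h>

    int checked_add(int a, int b) {
        int sum = a + b;          /* UB if it overflows... */
        if (b > 0 && sum < a)     /* ...so the compiler may assume this is
                                     always false and delete the branch */
            abort();
        return sum;
    }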


This is covered in the article and numerous replies in this thread. Use <stdckdint.h>.


stdckdint.h is only available in C23. The problem existed long before that and led to tons of exploits and bugs.


> all you have to do to avoid UB on signed integer overflow is check for overflow before doing the operation (and C23 finally adds features to do that for you).

…making your code practically unreadable, since you have to write ckd_add(ckd_add(ckd_mul(a,a),ckd_mul(ckd_mul(2,a),b)),ckd_mul(b,b)) instead of a * a + 2 * a * b + b * b.


That's not the correct syntax for the ckd_ operations. They take 3 operands, the first being a pointer to an integer where the result should be stored. And they return a bool, which you need to check in a conditional. If you're just going to throw out the bool and ignore the overflows, why bother with checked operations in the first place?
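
For reference, minimal correct usage looks like this (sketch):

    #include <stdckdint.h>
    #include <stdio.h>

    void demo(int a, int b) {
        int sum;
        if (ckd_add(&sum, a, b))   /* true means the result didn't fit */
            puts("overflow");
        else
            printf("%d\n", sum);
    }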


Yeah, I realize that now. That's even worse. So you'll have to write something like

    int aa, twoa, twoab, bb, aaplustwoab, aaplustwoabplusbb;
    if (ckd_mul(&aa, a, a)) { return error; }
    if (ckd_mul(&twoa, 2, a)) { return error; }
    // …
    if (ckd_add(&aaplustwoabplusbb, aaplustwoab, bb)) { return error; }
    return aaplustwoabplusbb;
So ergonomic!

> If you're just going to throw out the bool and ignore the overflows, why bother with checked operations in the first place?

I'd expect the functions to return the result on success and crash on failure. Or better, raise an exception, but C doesn't have exceptions…
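
You can layer that on top of the C23 primitives yourself; a sketch with a hypothetical (non-standard) wrapper:

    #include <stdckdint.h>
    #include <stdlib.h>

    /* add_or_die is hypothetical, not standard: it crashes on overflow
       instead of returning a bool to check. */
    static int add_or_die(int a, int b) {
        int r;
        if (ckd_add(&r, a, b))
            abort();
        return r;
    }

With wrappers like this, the nested-expression style from upthread becomes usable again.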


Why not just write:

    bool aplusb_sqr(int* c, int a, int b) {
        /* true on success; false if c is null or a step overflows */
        return c && !ckd_add(c, a, b) && !ckd_mul(c, *c, *c);
    }


Obviously you could do that in this case, I just wanted to come up with a complicated formula.


See my other comment [1], which addresses the exact things you brought up here. Safe checked arithmetic is a new standard feature in C23. And if loops that make no progress were not UB, then tons of loop optimizations would be impossible, and then we couldn't have nice things, like numpy.

[1] https://news.ycombinator.com/item?id=35406554
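
A classic illustration of the forward-progress rule (C11 6.8.5p6), as a sketch:

    /* Nobody knows whether this terminates for every x, but the loop has
       no side effects, so the compiler may assume it does terminate and,
       e.g., delete the call entirely when the result is unused. */
    unsigned collatz(unsigned x) {
        while (x > 1)
            x = (x % 2) ? 3 * x + 1 : x / 2;
        return x;
    }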


> Any signed addition in C is potential UB unless you have a proof that all numbers that will ever be input to the addition won't cause overflow

This has always been the case. Standard C has always operated with the possibility that addition can overflow. The programmer or library writer is responsible for checking that the types used are large enough. If you want to be perfectly sure, you need to check for overflow. Making this UB has not changed the nature of the issue.

> is made harder because C doesn't define the size of the default integer types

They correctly made this implementation-defined. But C now has fixed-width integer types if you want to be sure of the size.
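
For example (sketch):

    #include <stdint.h>

    int32_t a;         /* exactly 32 bits wherever it's provided */
    uint8_t b;         /* exactly 8 bits */
    int_least16_t c;   /* at least 16 bits, always available */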



