The problem with -2147483648 in C. (hardtoc.com)
62 points by jodoherty on Sept 7, 2011 | 21 comments



This bit of asymmetry also leads to this interesting security implication: http://my.opera.com/taviso/blog/show.dml/639454

(In short, INT_MIN / -1 can crash your program with an exception, since the result can't be represented in an integer. It's an easy way to crash a program that does integer math on untrusted input.)
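Tavis's post includes a small test program; a minimal equivalent sketch (not his exact code) looks like this:

    #include <stdio.h>
    #include <stdlib.h>

    /* Run as: ./a.out -2147483648 -1
       On x86 the quotient 2147483648 doesn't fit in a 32-bit int, so
       IDIV raises interrupt 0 and the process dies with SIGFPE. */
    int main(int argc, char **argv)
    {
        int a, b;
        if (argc != 3) return 1;
        a = atoi(argv[1]);
        b = atoi(argv[2]);
        printf("%d\n", a / b);   /* runtime operands, so it isn't folded away */
        return 0;
    }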


Head explodey. I've been writing C since 1994 and doing security since 1995 (and, to boot, I try to follow Tavis) and I did not know this. I love this industry.


Me too. I knew this was undefined in theory, because C treats any kind of signed overflow as undefined behavior, but I'd never run into this specific issue. Trapping it is a denial-of-service risk, but at least it won't subtly break downstream invariants like most 2's complement signed overflow bugs.

Some architectures have signed add/sub instructions that will optionally trap on overflow. If you're going to trap, it seems like you should do it across the board rather than reserve it for one arithmetic instruction, as with IDIV. The x86 IDIV/IMUL and DIV/MUL instructions have a nice left-inverse pairing (the multiplies take two n-bit operands to a 2n-bit result; the divides take a 2n-bit operand and an n-bit operand to an n-bit result), but that symmetry breaks down when you work with fixed-size operands and results, as is usually the case.
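To make the pairing concrete, here's a hypothetical sketch (assuming stdint.h types and a nonzero divisor): the widening multiply can always be undone by the narrowing divide, which is exactly the case where IDIV cannot trap.

    #include <stdint.h>

    /* IMUL-style: two 32-bit operands -> 64-bit product.
       IDIV-style: 64-bit dividend, 32-bit divisor -> 32-bit quotient.
       Here the quotient is just a, so it always fits and never traps. */
    int32_t recover_factor(int32_t a, int32_t b)
    {
        int64_t product = (int64_t)a * b;
        return (int32_t)(product / b);
    }

With fixed-size operands (plain int / int), the dividend is no wider than the quotient, and INT_MIN / -1 is the one quotient that needs an extra bit.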


My current favorite interview question is "write itoa in C", and this issue actually comes up in almost every solution. (Generally, people want to set a flag for negative input and then treat every value the same, so: if (n < 0) { negative = true; n = -n; }.)
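For reference, the shape of that typical answer (a sketch with assumed names); the negation is where INT_MIN bites, since -INT_MIN isn't representable:

    /* The usual interview answer: flag the sign, negate, emit digits. */
    void itoa_naive(int n, char *buf)
    {
        char tmp[12];
        int i = 0, j = 0, negative = 0;
        if (n < 0) {
            negative = 1;
            n = -n;              /* undefined behavior when n == INT_MIN */
        }
        do {
            tmp[i++] = (char)('0' + n % 10);
            n /= 10;
        } while (n != 0);
        if (negative) buf[j++] = '-';
        while (i > 0) buf[j++] = tmp[--i];
        buf[j] = '\0';
    }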

Catching the INT_MIN case separates the good (got everything else right) from the "omg offer them the world" candidates.


It's interesting to see the range of proposed solutions to this issue. Here are a few from memory.

Koenig's C Traps and Pitfalls does the opposite of what you'd expect by making the recursion work with negative integers and reducing the positive case to that. That in turn exposes the issue of C having an implementation-dependent meaning for x / y and x % y when x is negative. But he works around that in the usual way.
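From memory, the shape of it is something like this (my reconstruction, not Koenig's exact code):

    #include <stdio.h>

    /* Emit the digits of n, where n <= 0; INT_MIN needs no special case. */
    static void print_digits(int n)
    {
        int q = n / 10, r = n % 10;
        if (r > 0) { r -= 10; q++; }   /* the usual workaround: force r into (-10, 0] */
        if (q < 0) print_digits(q);
        putchar('0' - r);
    }

    void print_int(int n)
    {
        if (n < 0)
            putchar('-');
        else
            n = -n;   /* negating a non-negative int is always representable */
        print_digits(n);
    }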

This solution doesn't assume 2's complement, but it does assume that the negative range of integers is at least as large as the positive range. This works with 2's complement, 1's complement and sign-magnitude, so it's pretty damn bulletproof. It wouldn't work with a hypothetical representation where the positive range was larger than the negative range. Making this fully portable to any ANSI C conforming system is possible (work with either positive or negative integers depending on which has the larger range) but it involves the chicken-and-egg problem of portably determining whether INT_MIN or INT_MAX is larger in magnitude. Here's one way to do that (untested):

    #include <stdlib.h>   /* div_t */

    /* Floored division for denom > 0: remainder always lands in [0, denom). */
    div_t quotrem(int numer, int denom)
    {
        div_t d;
        d.quot = numer / denom;
        d.rem = numer % denom;
        if (d.rem < 0) { d.rem += denom; d.quot--; }
        return d;
    }

    /* Compare |x| with |y| without ever computing either magnitude (so
       INT_MIN is safe); returns <0, 0, >0.  abscmp(INT_MIN, INT_MAX)
       answers the chicken-and-egg question above. */
    int abscmp(int x, int y)
    {
        div_t dx, dy;
        int r;
        if (x == 0 && y == 0) return 0;
        if (x == 0) return -1;   /* |x| = 0 < |y| */
        if (y == 0) return +1;   /* |x| > 0 = |y| */
        dx = quotrem(x, 2);
        dy = quotrem(y, 2);
        /* Convert the floored results to truncation toward zero, so that
           |x| == 2*|dx.quot| + |dx.rem| and the recursion shrinks. */
        if (x < 0 && dx.rem != 0) { dx.quot++; dx.rem -= 2; }
        if (y < 0 && dy.rem != 0) { dy.quot++; dy.rem -= 2; }
        r = abscmp(dx.quot, dy.quot);
        return r != 0 ? r : (dx.rem != 0) - (dy.rem != 0);
    }
That is purely an academic exercise in theoretical portability wanking, of course.

Hanson and Fraser's A Retargetable C Compiler does the inner loop the usual way with positive integers but using unsigned arithmetic. They specifically handle the INT_MIN case with code along these lines:

    if (i >= 0)
        u = (unsigned) i;
    else if (i == INT_MIN)
        u = 1 + (unsigned) INT_MAX;   /* |INT_MIN| without negating it */
    else
        u = (unsigned) -i;
Thus it assumes a 2's complement representation.

Here's another solution I just thought of that should be portable to any ANSI C conformant platform. Write the usual divide-by-base-until-zero loop but use this function to determine the right digit to put into the string buffer:

    int digit(int x, int base)
    {
        /* quotrem's remainder is in [0, base); for negative x the low
           digit of |x| is (base - rem) mod base, not base - rem. */
        div_t d = quotrem(x, base);
        return x >= 0 ? d.rem : (base - d.rem) % base;
    }
This works for either positive or negative integers. That's the key. You don't need to normalize anything to either positive or negative values in your inner loop.
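One possible shape for that loop (my sketch, not the parent's; it reuses quotrem and digit from above, and steps the quotient toward zero so the loop terminates for negative inputs):

    #include <stdlib.h>   /* div_t */

    void itoa_digits(int n, char *buf)
    {
        char tmp[12];
        int i = 0, j = 0, negative = n < 0;
        do {
            div_t d = quotrem(n, 10);
            tmp[i++] = (char)('0' + digit(n, 10));
            /* Step toward zero: bump the floored quotient when n is
               negative and there was a remainder. */
            n = (n < 0 && d.rem != 0) ? d.quot + 1 : d.quot;
        } while (n != 0);
        if (negative) buf[j++] = '-';
        while (i > 0) buf[j++] = tmp[--i];
        buf[j] = '\0';
    }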


Hanson and Fraser's example can be improved:

  if (i >= 0)
      u = (unsigned)i;
  else
      u = -(unsigned)i;
This doesn't assume 2's complement - it works based on C's defined rules for wrapping arithmetic in unsigned types (out-of-range results are brought into range by repeatedly adding or subtracting UINT_MAX + 1).
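Concretely (a sketch assuming 32-bit ints): the cast maps -5 to UINT_MAX - 4, and the unsigned negation wraps that back to 5; INT_MIN comes out as 2147483648u with no special case.

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        int i = INT_MIN;
        /* (unsigned)i adds UINT_MAX + 1; unary minus then wraps modulo
           UINT_MAX + 1 again, yielding the magnitude of i. */
        unsigned u = -(unsigned)i;
        printf("%u\n", u);   /* 2147483648 with 32-bit ints */
        return 0;
    }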


Yeah, I had a conversation a few weeks ago with a coworker where he misremembered that as being what was in the lcc book. I actually posted about this here (including an excerpt from Steele and Harbison) ten days ago: http://news.ycombinator.com/item?id=2935583. I notice my description there was slightly bungled since I forgot that after the cast for negative integers you of course have to do an unsigned negation (otherwise -1 would be mapped to UINT_MAX, for example).


Neat.

K&R's itoa implementation example (page 64, in case anyone is curious) specifically calls out that it doesn't handle INT_MIN, and explaining why and fixing it is one of the exercises.


How often are you interviewing C programmers for project work? Genuinely curious.


Um... what? That produces no exception on any architecture I know. In 2's complement, -INT_MIN == INT_MIN. It certainly can be represented in an integer, and it's perfectly legal to return -2147483648 from that division, and the machine in front of me certainly does.

Integer math is done modulo 2^n (with a -2^(n-1) offset for signed operations) on modern hardware. Overflow is part of the design. There are no exceptions defined for results that "can't be represented". Divide by zero aborts because the underlying mathematical concept doesn't exist; that's different from a mere overflow.

Obviously there can be bugs that can result from an overflow, but overflow bugs are hardly limited to INT_MIN.


> Um... what? That produces no exception on any architecture I know.

Have you tried actually compiling and running Tavis's test program with the given arguments? I did and it crashed on my Core i7 running on OS X.

In gdb I get:

    (gdb) run -2147483648 -1
    Starting program: /private/tmp/test3 -2147483648 -1
    Reading symbols for shared libraries +. done

    Program received signal EXC_ARITHMETIC, Arithmetic exception.
    0x0000000100000f23 in main ()
Hint: it won't work to just compile a program that does INT_MIN / -1 at compile time; the compiler will constant-fold it, preventing an actual "idiv" instruction from being generated.


Ooh, my bad. Though I was actually smarter than that and made gcc (4.6.0, amd64) do the division. The problem is that "int a=0x80000000; printf("%d\n", a/-1);" when compiled without (!) optimization generates a NEG instruction, not an IDIV.

That's not constant folding, that's strength reduction. And it's an optimization that I specifically told gcc not to do. Honestly, I'd consider this a compiler bug. And of course NEG does what I expected and produces the 2's complement negation of 0x80000000 (which is 0x80000000).


On PPC it does not produce an exception. Inspection of the assembler reveals, in part:

        cmpwi cr7,r0,0  ; test the divisor against 0
        beq cr7,L5      ; skip the divide if it is 0
        lwz r2,60(r30)  ; load the dividend
        lwz r0,56(r30)  ; load the divisor
        divw r2,r2,r0   ; ordinary integer divide
        stw r2,72(r30)  ; store the quotient
        b L7
  L5:
        li r0,0         ; else load 0
        stw r0,72(r30)  ; store the 0
The following thread has more about the reason PPC does not raise exceptions for these overflows:

http://fixunix.com/embedded/341544-divide-zero-does-not-cras...

But they're only thinking about division by 0, not by -1.


The Intel x86 spec for IDIV says interrupt 0 is triggered if the quotient is too large to fit in the designated register. That would seem to be the case here.


No, in two's complement:

INT_MAX + 1 == -INT_MIN


Two's complement negation of N is defined as ~N+1 (try it). ~INT_MIN == ~0x80000000 == 0x7fffffff. 0x7fffffff+1 = 0x80000000.
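A quick sketch of that identity, done in unsigned arithmetic (and assuming 32-bit ints) to sidestep the very overflow under discussion:

    #include <stdio.h>

    int main(void)
    {
        unsigned n = 5;
        printf("%u\n", ~n + 1);     /* 4294967291, i.e. (unsigned)-5 */
        unsigned m = 0x80000000u;   /* the INT_MIN bit pattern */
        printf("0x%x\n", ~m + 1);   /* 0x80000000: its own negation */
        return 0;
    }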


In your last step there you are literally adding one to INT_MAX, which is an overflow. Hence the exception.


I pity all you benighted knaves who still labor with bounded-precision integers. Lisp has tried to show you the True Way for decades now, but you have not listened! Unbounded-precision integers would free you from your chains!

(Obviously tongue-in-cheek... mostly... :-)


If Giuseppe Peano had not handed down to us unbounded precision integers, it would have been necessary for John McCarthy to invent them.

Why?

To count all those freakin parentheses! HAHAHAHAHAA!


It isn't bounded-precision per se that is the problem. Bounded precision by default is.

Combine the absence of unbounded-precision integers in a language with non-fixed sizes for the existing integer types, and it becomes a proper problem.


Some[1] have listened.

[1] http://gmplib.org/



