I remember reading this (back in 2006?) and looking at the versions of Binary Se...

abecedarius · on Feb 17, 2010

Yes, I've usually represented the range with (low, size) instead of (low, high) (and then the high-low part never comes up).
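
A minimal sketch of that (low, size) style, assuming a sorted int array and std::size_t indices. The function name lower_bound_index is made up for illustration; the point is only that the midpoint is computed as low plus half the remaining size, so no low + high sum ever appears:

```cpp
#include <cstddef>

// Hypothetical helper: index of the first element >= key in sorted a[0..n),
// or n if there is no such element.
std::size_t lower_bound_index(const int* a, std::size_t n, int key) {
    std::size_t low = 0;   // start of the remaining range
    std::size_t size = n;  // number of elements still under consideration
    while (size > 0) {
        std::size_t half = size / 2;
        std::size_t mid = low + half;  // mid < low + size <= n, so no overflow
        if (a[mid] < key) {
            low = mid + 1;             // discard a[low..mid]
            size -= half + 1;
        } else {
            size = half;               // keep a[low..mid)
        }
    }
    return low;
}
```

For example, on {1, 3, 5, 7, 9} a search for 6 returns index 3.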

So when this article first came out I was all "Aren't you bagging on Bentley too much? He published pseudocode, and of course you have to take care about things like overflow when converting it to C." But then I saw Bentley saying somewhere that no, he really hadn't considered this problem. (IIRC)

neilc · on Feb 17, 2010

Bentley didn't just publish pseudo-code; he also published C, and the published C contains the bug.
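
For reference, assuming the bug meant here is the familiar midpoint overflow, this is a sketch of the problem and the usual fix. It is not Bentley's actual code, and the name midpoint is only illustrative:

```cpp
#include <climits>

// Overflow-prone form (undefined behaviour in C and C++ once low + high
// exceeds INT_MAX):
//     int mid = (low + high) / 2;

// Safe for 0 <= low <= high: the difference high - low always fits in an int.
int midpoint(int low, int high) {
    return low + (high - low) / 2;
}
```

With low = INT_MAX - 2 and high = INT_MAX, the sum low + high does not fit in an int, while high - low is just 2.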

abecedarius · on Feb 17, 2010

Ah well, at least it fit the theme here when I misremembered.

nova · on Feb 17, 2010

regardless of how they are represented, are not things that should be added together.

Maybe something about torsors?

http://math.ucr.edu/home/baez/torsors.html

ggchappell · on Feb 18, 2010

Apparently.

(Nice article. Thanks for the link.)

stonemetal · on Feb 17, 2010

How does that avoid the problem? Assuming you used a signed integer for size and the list is large enough, size rolls over to a negative value. If size becomes less than -2, then low + size/2 is less than low (so on the first pass of a binary search you now have an invalid iterator for mid). If you used unsigned integers, size can still be wrong after a rollover, though this time mid will always be valid.

notaddicted · on Feb 17, 2010

If you have a byte array on a 32-bit machine, assuming your program's text (i.e. instructions) takes up some space, the biggest possible array will be less than 2^32 bytes = 4 GiB, so the midpoint will be too. There can be no overflow.

If you subtracted two 64-bit pointers and stored the result in an int32, though, then you would have problems.
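
A small sketch of guarding against that narrowing, assuming the difference is first held in a std::ptrdiff_t. The helper name fits_in_int32 is hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// True if a pointer difference can be stored in an int32_t without loss.
// Where ptrdiff_t is itself 32 bits wide, this is trivially always true.
bool fits_in_int32(std::ptrdiff_t d) {
    return d >= std::numeric_limits<std::int32_t>::min()
        && d <= std::numeric_limits<std::int32_t>::max();
}
```

On a 64-bit platform, a difference of 2^31 or more fails this check, which is exactly the case where storing it in an int32 would go wrong.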

ggchappell · on Feb 18, 2010

> Assuming you used a signed integer for size ....

By "integer", I did not mean "int". size was a std::size_t.

pmiller2 · on Feb 17, 2010

Perhaps even more generally, the range ADT should include a method to locate the midpoint of a range, in addition to the endpoints.
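
One way such a range type might look, as a rough sketch. IndexRange and its member names are invented for illustration, and it stores (low, size) as suggested upthread, so the midpoint never needs the sum of two endpoints:

```cpp
#include <cstddef>

// Hypothetical half-open index range [low, low + size).
struct IndexRange {
    std::size_t low;
    std::size_t size;

    bool empty() const { return size == 0; }

    // One past the last index; valid as long as the range itself is valid.
    std::size_t high() const { return low + size; }

    // Callers ask the range for its midpoint instead of averaging endpoints.
    std::size_t midpoint() const { return low + size / 2; }
};
```

For instance, IndexRange{10, 5}.midpoint() is 12, and a range sitting near the top of the std::size_t range still yields a midpoint inside itself.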