Competitive programming in Haskell: better binary search (byorgey.wordpress.com)
131 points by g0xA52A2A on Jan 1, 2023 | 17 comments



The "black magic" of binaryFloat is a known type pun to a sign-magnitude integer. Because the example searched over positive values it always worked, albeit taking a few extra steps to converge; negative values, however, are not handled correctly and will explode --- it also won't return NaN for the sqrt of a negative x, though it explodes well before that becomes a problem.

Negative l, positive r works because the first binary step halves the (wrapped) midpoint computation, thereby losing the top sign bit (halving being equivalent to a right shift), and basically restarting the search with an l of 0.0 (or rather some very small float near it). I recommend printing each step out with its binary representation, to see the working.
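Something like this shows the effect (a sketch with my own names, not the article's code --- the predicate is the article's 'x^2 >= 150' example):

```haskell
import Data.Bits (testBit)
import Data.Word (Word64)
import GHC.Float (castDoubleToWord64, castWord64ToDouble)

bits64 :: Word64 -> String
bits64 w = [if testBit w i then '1' else '0' | i <- [63, 62 .. 0]]

-- Bisect on the raw bit patterns, printing every midpoint probed.
searchBits :: (Double -> Bool) -> Word64 -> Word64 -> IO (Double, Double)
searchBits p l r
  | r - l <= 1 = pure (castWord64ToDouble l, castWord64ToDouble r)
  | otherwise = do
      let m = l + (r - l) `div` 2
      putStrLn (bits64 m ++ "  " ++ show (castWord64ToDouble m))
      if p (castWord64ToDouble m) then searchBits p l m else searchBits p m r

main :: IO ()
main = do
  (lo, hi) <- searchBits (\x -> x * x >= 150)
                         (castDoubleToWord64 (-100))
                         (castDoubleToWord64 100)
  print (lo, hi)
```

The first printed midpoint has no sign bit and is a tiny positive float, after which the search proceeds normally and ends with two adjacent doubles bracketing sqrt 150.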

The spec covers these binary shenanigans under comparison and total ordering, though it relies on some NaN behaviour being followed (the is_quiet flag behaviour recommended in 754-2008). See Herf's old post for the proper bit twiddling: https://web.archive.org/web/20220326204603/http://stereopsis... (wayback link, as their webserver is down).
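The standard trick from that genre of bit twiddling looks roughly like this in Haskell (function name is mine); it maps the sign-magnitude bits to unsigned integers whose ordering agrees with the floats, negatives included:

```haskell
import Data.Bits (complement, setBit, testBit)
import Data.Word (Word64)
import GHC.Float (castDoubleToWord64)

-- Negative floats get every bit flipped; non-negative ones get only the
-- sign bit flipped.  Unsigned comparison then matches float comparison.
orderedBits :: Double -> Word64
orderedBits x
  | testBit w 63 = complement w   -- negative: flip every bit
  | otherwise    = w `setBit` 63  -- non-negative: flip only the sign bit
  where
    w = castDoubleToWord64 x
```

With this, orderedBits (-2) < orderedBits (-1) < orderedBits 0 < orderedBits 1, so a binary search over the transformed values needs no sign special-casing.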

(This is an expanded version of my own comment on the page, since it seems stuck in "awaiting moderation" limbo, just in case it shows up later while I'm not here.)


cf EWD1293: https://www.cs.utexas.edu/users/EWD/ewd12xx/EWD1293.PDF

(note that binary search is an example of computer scientists acknowledging the existence of practitioners: they won't go so far as to do a k-ary search, for that would require engineering a [necessarily ephemeral] suitable value of k, but they have demonstrated goodwill in stepping beyond the much-cheaper-to-prove 1-ary search.)


You're making me wonder if it would make sense to have k scale with r (where r is the narrowed-down range).

In practice binary search tends to perform better if we switch to linear search for small enough r, right? Kind of like how insertion sort is faster when r is less than 20-40 integers on modern hardware (this is assuming we're dealing with dense arrays in either scenario, obviously).

One could say that this is effectively switching from k = 2 to k = r at some threshold value. What if instead we increase k gradually as some proportion of r as the latter gets smaller, and sample in ascending order? (Although I bet that even if it works, interpolation search [0] is probably faster anyway.)

[0] https://en.wikipedia.org/wiki/Interpolation_search
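The threshold version is easy to sketch (a hypothetical toy, not measured code --- the cutoff of 32 is a guess, not a tuned value):

```haskell
-- First index i in [lo, hi) with p i, assuming p is monotone (all False,
-- then all True); returns hi if p never holds.  Degrades to a linear scan
-- once the window is small.
firstTrue :: (Int -> Bool) -> Int -> Int -> Int
firstTrue p lo hi
  | hi - lo <= 32 = head ([i | i <- [lo .. hi - 1], p i] ++ [hi])
  | p m           = firstTrue p lo (m + 1)
  | otherwise     = firstTrue p (m + 1) hi
  where
    m = lo + (hi - lo) `div` 2
```

For example, firstTrue (\i -> i * i >= 150) 0 100 finds 13 after two halvings and a short scan.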


> we can also do exact binary search on the bit representations of floating-point numbers! That is, we do binary search as if the bit representations of l and r were unsigned integers. This is possibly more efficient than “continuous” binary search, and lets us find the two precisely adjacent floating-point numbers where our predicate switches from False to True.

Why is it better to use integer operations than floating-point operations? Assuming everything is inlined and type-inferred away, float math should be faster, and just as precise.

One may argue that the convergence rate is less predictable; but this may be a feature, if the predicate under consideration is not quite monotonic (monotonicity is hard to guarantee when negative numbers are involved--though '\x -> x^2 >= 150' is straight up not monotonic, both mathematically and numerically).

> It even works when l is negative and r is positive (it seems like in that case the bit representation of l would correspond to a larger unsigned integer than r, but somehow it all works anyway!).

It's because of underflow:

  ghci> (unsafeCoerce (-100.0) :: Word) - (unsafeCoerce 100.0 :: Word)
  9223372036854775808
For Word, the condition 'r - l > 1' is true in all cases where l>r.


The predicate given is monotonic on the range [0.0, 100.0].

Using the floating-point bit representation (with appropriate starting l, r, as they have) gives the best possible convergence rate: one bit per iteration. Using floating-point math would also have this convergence rate, but the halting condition would be interesting. Do you use the smallest epsilon? Alternatively, do you check that the midpoint is one of l, r?

Both approaches require you to know a little something about floating-point numbers, and I'd say it's mostly an aesthetic choice, especially since the convergence rate is the same.


> the halting condition would be interesting

For l,r known to be positive: l*(1+(2^-52)) < r.

Substitute 2^-23 for single-prec floats.

> Using floating point math would also have this converge rate

As I said, it does not; using float math has a less predictable convergence rate. With float math, you'd halve the magnitude of the search space every iteration, whereas with integer math, you'd halve the number of floating-point numbers contained in the range.


Both methods give you a bit of information per iteration. The number of distinct floating-point numbers between l and r is the same as the number of integers between the reinterpretations of l and r as integers.

(assuming starting l,r were appropriate)
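This is easy to check directly; a small sketch (helper name mine), valid for positive doubles l < r:

```haskell
import Data.Word (Word64)
import GHC.Float (castDoubleToWord64)

-- Representable doubles strictly between two positive doubles l < r:
-- just the difference of the bit patterns, minus one.
floatsBetween :: Double -> Double -> Word64
floatsBetween l r = castDoubleToWord64 r - castDoubleToWord64 l - 1
```

For instance, floatsBetween 1 2 gives 2^52 - 1 while floatsBetween 2 3 gives 2^51 - 1: twice as many doubles in [1, 2] as in [2, 3].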


Suppose l=1, r=3. So m=(1+3)/2=2. There are twice as many floating-point numbers between 1 and 2 as between 2 and 3. So if you refine from [1 3] to [1 2], you've only managed to prune 1/3 of your search space, not 1/2.

(Also: integer halving breaks when l and r have opposite signs.)

Perhaps this will make it clearer: suppose we have l=0 and r=1, and the number we're searching for is the smallest float, 2^-1074. If we use integer halving, we get one bit per iteration; doubles have 64 bits, so we should require no more than 64 iterations. But if we use floating-point halving, it will obviously take 1074 iterations to get from [0 1] to [0 2^-1074].
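That example can be checked mechanically (sketch, helper names mine; the predicate is "at least 2^-1074", built exactly with encodeFloat):

```haskell
import Data.Word (Word64)
import GHC.Float (castDoubleToWord64, castWord64ToDouble)

tiny :: Double
tiny = encodeFloat 1 (-1074)   -- the smallest positive double, 2^-1074

-- Bisect on float values; stop when the midpoint stops making progress.
stepsFloat :: (Double -> Bool) -> Double -> Double -> Int
stepsFloat p l r
  | m == l || m == r = 0
  | p m              = 1 + stepsFloat p l m
  | otherwise        = 1 + stepsFloat p m r
  where m = (l + r) / 2

-- Bisect on bit patterns; stop when l and r are adjacent.
stepsBits :: (Double -> Bool) -> Word64 -> Word64 -> Int
stepsBits p l r
  | r - l <= 1               = 0
  | p (castWord64ToDouble m) = 1 + stepsBits p l m
  | otherwise                = 1 + stepsBits p m r
  where m = l + (r - l) `div` 2
```

stepsFloat (>= tiny) 0 1 comes out at 1074, while stepsBits (>= tiny) over the same interval's bit patterns takes only 61 steps (the starting width, 0x3FF0000000000000, is just under 2^62).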


Oh yeah, nice. Thanks, the example did make it clear.


> For Word, the condition 'r - l > 1' is true in all cases except when r=l.

Simplest counterexample: (l,r) = (0,1)

> For Word, the condition 'r - l > 1' is true in all cases where l>r.

Simplest counterexample: (l,r) = (1,0), although one should always have l <= r.


> Simplest counterexample: (l,r) = (1,0)

Then r-l is 0-1, which underflows.

  ghci> (0 :: Word) - (1 :: Word)
  18446744073709551615


> Then r-l is 0-1, which underflows.

True. Then (l,r) = (18446744073709551615,0) is a counterexample, since r-l = 1.


So is it not true that it is not true?


18446744073709551615 is greater than 1, isn't it?


Ah, of course, sorry!


> > For Word, the condition 'r - l > 1' is true in all cases except when r=l.

Was this a typo that 'moonchild fixed?


Yes.



