Fast division by multiplication

unwind · on Feb 17, 2010

Stuff like this is what I think of when I hear "compiler writers are smarter than you, don't micro-optimize".

It's amazingly much better to have the compiler hide this, than for application-level programmers to figure out the trick by themselves, and obfuscate their code accordingly. Sweet.

pmjordan · on Feb 17, 2010

An oldie but goodie. In practice, compilers are pretty good at working this out for you, provided the denominator is a constant expression. In other words, if you're burning cycles because of a division with variable but rarely-changing denominator in an inner loop, it can make sense to precompute the equivalent multiplication factor and perform the multiplication explicitly.

drallison · on Feb 17, 2010

The article is not available at the moment so I am flying a bit blind here.

H. Warren, Hackers Delight has a nice discussion of this problem. One of my favorite books, but I like to do code generators.

http://www.amazon.com/Hackers-Delight-Henry-S-Warren/dp/0201...

arnorhs · on Feb 17, 2010

I wish I could understand that. The math and all. If I could, I would be such an awesome programmer.

And incredibly smart.

Hat's off for this guy at least.

pmjordan · on Feb 17, 2010

It's not exactly rocket science, here's a simplified explanation (minus some subtleties):

You're trying to calculate n/d (and the result should be rounded off to the nearest integer[1]). Division is expensive, but you observe that you can divide by a power of 2 cheaply using a shift instruction, so you decide to transform the calculation into something of the form (n * d')/(2^N), which is a multiply and a shift-right instruction. For the two forms to be equivalent, d' = 2^N / d, which you precompute.

Certain values of N can be more convenient from a computational point of view - for example, multiplying 2 32-bit integers yields a 64-bit result; on most architectures, the high and low halves either land in separate 32-bit registers, or you have to calculate each half with a separate instruction. So if you choose N=32, you don't even need a shift instruction: you simply drop the low bits.

You also need to make sure you choose your d' such that your rounding will be correct for all values of n.

Depending on the exact value of d', it may also be possible to decompose the multiplication further, leaving only shifts and additions/subtractions.

[1] Of course, for floating-point numbers, you can just calculate 1/d, save it and multiply by it any time you need to divide by d.

pw0ncakes · on Feb 17, 2010

Generally, binary math becomes easier to understand if you play with a decimal analogue.

Let's say that our integer domain is 0 to 9999 (decimal) but you have enough space to handle up to 9 digits. To integer-divide something by 3, you can multiply by 33334 and shift-right 5 digits (i.e. drop the last 5):

  7939 * 33334 = 264638626;
  2646 (right answer) after dropping 5 digits.

Note that 33334/100000 is close enough to 1/3 that, given our finite domain (as with [0..9999], or [0..2^31-1]), the error is too small to matter.

For some divisors, we're better off. For example, if n = 25 and we're in base 10, you can multiply by 4 and drop 2 digits, as 4/100 = 1/25.

Same principle applies to binary.

lemming · on Feb 17, 2010

As an aside, Ridiculous Fish, apart from the excellent name, is one of the best blogs I know - interesting topics, good scientific method proving his conclusions, and a great writing style. I wish he posted more.

euroclydon · on Feb 17, 2010

You may not wish he posted more. I'll bet his infrequent posting is a prime cause of all the great things you say about the blog.