Hacker News new | past | comments | ask | show | jobs | submit login

I wonder if there are a lot of edge cases (large numbers that aren't easily decomposed into shifts/additions) where the performance decreases a lot. If these optimizations are cheap, why aren't they used everywhere?



No, actually the performance is good for almost all divisors. I had the same thought you did, so I got the SVN and ran the benchmark. If we are try trust their timing, on a Core i7 the fast divisions were almost all around 4x the speed of the native. There were surprisingly small differences between the 2^n numbers and the harder ones --- the magic of long pipelines, I guess.

These optimizations aren't used much because they are specific to the divisor used. For a single division, it would be faster just to let the silicon do the work. They only help when you are repeatedly doing division by the same denominator, which is rarely a bottleneck. But when it is (repeated integer divisions but not hardcoded), this looks like a fine solution!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: