Math library functions that seem unnecessary (johndcook.com)
147 points by fogus on June 7, 2010 | 34 comments



When computing various values related to navigation you sometimes need 1 - cos(a) for very small values of a. If a is very close to zero then cos(a) is very close to 1, and as a varies (quite sensibly) over modest ranges, the value of 1 - cos(a) as computed by the naive expression bounces all over the place.

So one needs to detect when a is small and use the truncated series 1 - cos(a) ≈ a^2/2 - a^4/24.
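In code, the fix looks something like this (a minimal C++ sketch; the 1e-4 crossover and the helper name are illustrative, not from the actual software):

    #include <cmath>

    // Stable 1 - cos(a): for small a the direct subtraction cancels,
    // so switch to the truncated Taylor series a^2/2 - a^4/24.
    // The 1e-4 cutoff is an assumed, tunable threshold.
    double one_minus_cos(double a) {
        if (std::fabs(a) < 1e-4) {
            double a2 = a * a;
            return a2 / 2.0 - (a2 * a2) / 24.0;
        }
        return 1.0 - std::cos(a);
    }

(The identity 1 - cos(a) = 2 sin(a/2)^2 is another common way to avoid the cancellation everywhere, without a threshold.)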

It makes a difference in some safety-critical software I've written; a subsequent maintenance programmer removed the special case despite the comment explaining it, and I was then contracted to fix the major problem that ensued.

A similar problem arises with the naive solution of the quadratic equation. If you have ax^2 + bx + c = 0 and you use x = (-b ± sqrt(b^2 - 4ac)) / (2a), then you run into stability problems when ac is very small and a is small. Then the sqrt is close to |b|, so for one choice of sign in the numerator you get the subtraction of nearly equal values, giving rubbish. Then you divide by a small number and get big rubbish.

The solution in this case is to compute the root whose numerator is close in magnitude to 2b, and derive the second root from the first (using x1·x2 = c/a). More details if you want, but after seeing why it's hard, it's no longer interesting.
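A sketch of that approach, assuming real roots and a != 0 (the function name is mine):

    #include <cmath>

    // Stable quadratic solver: pick the sign so the numerator adds
    // magnitudes instead of cancelling, then recover the other root
    // from the product x1 * x2 = c / a. Assumes b^2 - 4ac >= 0 and
    // that b and c are not both zero (so q != 0).
    void solve_quadratic(double a, double b, double c,
                         double* x1, double* x2) {
        double d = std::sqrt(b * b - 4.0 * a * c);
        double q = -0.5 * (b + std::copysign(d, b));
        *x1 = q / a;
        *x2 = c / q;
    }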


As far as I understand, mathematically von Neumann's computers simply use a specific algebra that is decidedly different from naïve arithmetic. It is actually very useful in certain applications, such as cryptography, that rely on its algebraic properties. It may or may not be a good approximation for some numeric problems, but it is generally faster than working out elaborate mathematical structures.

But I'm only an ignorant student, and I wonder if these solutions are really significantly more effective than implementing the numerical tower or switching to bignums? What are the best practices and the consensus among professionals who have to deal with the inadequacies of floating-point numbers in real-world numeric applications?


I'm nowhere near as mathematically savvy as RoG, but I do have some practical experience from dealing with game physics simulations. A classic problem with naive collision code is objects popping through each other occasionally, and a common 'fix' is to switch from float to double precision. This reduces the frequency of the errors, but that's seldom good enough.

Plane intersections can be performed completely accurately using bignum code, since there are no irrational numbers to creep in, but the performance penalty is far too severe for everyday use in games (at least it was a decade ago when I was last in the industry). Instead, the best approach was to carefully redesign the algorithms so that they're tolerant of small errors. For example, by applying a repulsion force based on how deep inside a surface a polygon is, rather than trying to capture (and possibly missing) the exact point of an intersection, or adding a small 'thickness' to all polygons.
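A toy version of the penetration-depth response might look like this (the linear-spring form, names, and tuning constant are my assumptions for illustration, not the exact scheme we shipped):

    // Push a body back out along the surface normal in proportion to
    // how deep it has sunk, instead of solving for the exact contact
    // time. 'stiffness' is an assumed tuning constant.
    struct Vec3 { float x, y, z; };

    void apply_penetration_response(Vec3& velocity, const Vec3& normal,
                                    float depth, float stiffness, float dt) {
        if (depth > 0.0f) {              // only when actually inside
            float impulse = stiffness * depth * dt;
            velocity.x += normal.x * impulse;
            velocity.y += normal.y * impulse;
            velocity.z += normal.z * impulse;
        }
    }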

I've found this to be a useful approach in all my subsequent work in both graphics and statistical analysis. I start off using floats for all my real number storage, and if I see precision-related problems, I have a long hard think about how to redesign the algorithm to be more robust, rather than just automatically throwing more bits at it.


Bignums or other constructs are good answers sometimes, and not others. They can help you avoid various bits of piddling about, but at the expense of having an entirely different structure that subsequent maintainers might not be familiar with.

On the other hand, the standard floating point system has been in use for a long time, has well-developed libraries, and a large body of material devoted to understanding when you need to be careful, and when you don't.

In short, calculations of this type need understanding, either of the floating point domain or of an alternative domain, but certainly of the existence of the issues and problems.


First of all, the problems with fixnums vs bignums are very different from the problems with floating point precision. Fixnums are a slightly different mathematical structure from bignums, but they're both commutative groups, and are identical for most inputs. Floating point arithmetic can basically be treated as non-deterministic, which is an entirely different kettle of fish (RoG can probably elaborate better than me how this characterization is and is not valid).

If you're using a naive algorithm for numeric computing, you can get some really atrocious numerical stability issues. If you happen to have exact representations of your numbers, precision isn't an issue, but it's not common that exact representations are even possible. A lot of common numerical operations result in irrationals, including all root, logarithmic, exponential, and trigonometric functions, and those would take an infinite amount of space to represent faithfully. So really, you're always stuck with deciding how much is "enough" for whatever you're doing.

The amount of space that is "enough" for a naive algorithm varies wildly depending on the particular naive algorithm, and some of them are disgustingly high. So, in order to have your naive algorithm output an acceptable answer, you'd have to do a lot of numerical stability analysis. But a similar amount of numerical stability analysis can usually give you a different algorithm that can get "enough" precision with the standard sizes of floats. Using standard floats gives all sorts of other nice benefits like being supported in modern languages, better speed and space performance, and hardware acceleration.
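A standard example of such a redesign: the textbook one-pass variance formula sum(x^2)/n - mean^2 cancels catastrophically when the values are large relative to their spread, while Welford's update gets "enough" precision out of plain doubles:

    #include <cstddef>
    #include <vector>

    // Welford's one-pass algorithm: numerically stable sample
    // variance in ordinary doubles, no extra precision required.
    double sample_variance(const std::vector<double>& xs) {
        double mean = 0.0, m2 = 0.0;
        std::size_t n = 0;
        for (double x : xs) {
            ++n;
            double delta = x - mean;
            mean += delta / n;               // running mean
            m2 += delta * (x - mean);        // uses the updated mean
        }
        return n > 1 ? m2 / (n - 1) : 0.0;
    }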

Additionally, most of the common pitfalls of floats are well-known by now, and have well-known solutions, and there are a number of variants and libraries for most 'workhorse' algorithms that trade between speed, space, and numerical stability. It comes down to being another case of knowing your tools.

(PS--Numerical stability doesn't have to do with von Neumann architecture, only with representation of numeric values)


"A lot of common numerical operations result in irrationals, including all root, logarithmic, exponential, and trigonometric functions, and that would take an infinite amount of space to represent faithfully."

Actually symbolic representation can help here - not just to represent the values, but help the final precision. As a trivial example 2/sqrt(2) can be reduced to sqrt(2) with no loss of precision from the division.

This is a lot of hassle though, and not a perfect solution, but it helps avoid rounding errors from irrational constants or functions... in a sense some of your intermediate values have infinite precision (but in reality you removed the operation with a clever optimisation/precision-conserving trick).


> (PS--Numerical stability doesn't have to do with von Neumann architecture, only with representation of numeric values)

Which you may view as part of the von Neumann architecture. But only if you compare it to something radically different, like analogue computers.


On a related topic: it is surprising how many business applications use floating-point values to represent currency. That is very wrong, because it creates rounding errors that accumulate. A common symptom is an invoice whose total cannot be reproduced from the line items using a regular handheld calculator.

How should you handle currency? At least in Java there is a BigDecimal that should be used instead. Look at the divide() method to see how you can have different rounding behaviors. As an alternative, some people use integers with a convention that there is a decimal point in the nth position. Just remember that floating point shouldn't be used for currency.
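A quick C++ demonstration of the drift (the 0.10 increment and the thousand iterations are arbitrary):

    #include <cstdint>
    #include <cstdio>

    int main() {
        // Floating point: 0.10 has no exact binary representation,
        // so repeated addition drifts away from the true total.
        double d = 0.0;
        for (int i = 0; i < 1000; ++i) d += 0.10;
        std::printf("double: %.15f\n", d);          // not exactly 100

        // Integer cents: exact by construction.
        std::int64_t cents = 0;
        for (int i = 0; i < 1000; ++i) cents += 10; // 10 cents each
        std::printf("cents:  %lld.%02lld\n",
                    (long long)(cents / 100), (long long)(cents % 100));
    }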


> How should you handle currency?

Cents, or fractions thereof, fit into an integer, unless you have to do currency conversion.

If you have to do that, you hire a programmer who has studied numerical analysis. It's a fun course, but very hard. I remember studying with a textbook the professor himself wrote and not being able to find much information online. This has changed quite a bit in the past decade or so. Nowadays, you can find useful stuff like this online:

http://en.wikibooks.org/wiki/Real_Analysis
http://floating-point-gui.de/
http://docs.sun.com/source/806-3568/ncg_goldberg.html


This is stuff you should have learned in school in a numerical methods or numerical analysis class. If you didn't, take a read of What Every Computer Scientist Should Know About Floating-Point Arithmetic.

http://docs.sun.com/source/806-3568/ncg_goldberg.html


I saw the title and immediately thought: that's going to be about expm1 et al... Anyone who's done numerical analysis connected to Blackbody functions or mean-free-path calculations will have learned this the hard way, like I did. ;-)


I think the author's point was that you shouldn't have to learn these things the hard way :)


Numerical Methods is one of the few classes I've taken in college that has direct application to what I'm doing now. Chances are anyone majoring in a science/engineering field would find it useful later on.


Why is the "random()" function in libraries never very random? Why do I always have to use some other function when I need something more random? Shouldn't the poor implementation be called "fast_random()" or something similar?


Principle of least surprise.

If programmers want a "good enough" random value they will try random() first. If you really have a need for a better random value (e.g. usable for strong cryptography) you can be assumed to already know that random() is not good enough and can be trusted to search for a better substitute.


> If you really have a need for a better random value ... you can be assumed to already know ... and can be trusted to search for a better substitute.

Oh, if only that were true.

"If all scientific papers whose results are in doubt because of bad rand()s were to disappear from library shelves, there would be a gap on each shelf about as big as your fist." (From Numerical Recipes, which ironically has given at least its share of bad advice on random number generation, though the latest edition isn't too bad. Actually, the quotation is from an earlier edition because the latest edition puts it more accurately but less concisely.)

Well ... strictly speaking it's perfectly true that you can be assumed to know, and can be trusted to search for something better. The trouble is that the assumption and the trust would be very unwise.


I'd say that random() producing good pseudorandom numbers is less surprising than not.

If programmers want a random number they will try random() first. If they find its performance inadequate, they can then search for faster alternatives.


In theory you are correct; in practice it is more complicated. There is no such thing as a good pseudo-random number generator ("good" meaning good for all use cases), only good enough. For random() to be useful it has to generate "good" numbers for the average programming task, but it must also be fast enough that the average task can use it. Both properties are in conflict, so programming language implementors have to balance them out.


I disagree with this. For non-cryptographic purposes, the Mersenne Twister produces very high-quality pseudorandom output, has a huge period, and is competitively fast. There are also related shift-register techniques with these properties.

The state of the art in PRNGs is still advancing, but it takes a fairly esoteric example to find a blind spot of the MT (if initialized properly, as wrappers should do). Such cases are way outside the "generate a random 512x512 bitmap" variety.
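For example, C++11's <random> ships MT19937 directly, and a wrapper can initialize it properly in a few lines (a minimal sketch; the eight-word seed_seq is one reasonable choice, not the only one):

    #include <iostream>
    #include <random>

    int main() {
        // Seed the full state from several entropy words rather than
        // a single 32-bit value.
        std::random_device rd;
        std::seed_seq seq{rd(), rd(), rd(), rd(), rd(), rd(), rd(), rd()};
        std::mt19937 gen(seq);                   // period 2^19937 - 1

        std::uniform_real_distribution<double> dist(0.0, 1.0);
        for (int i = 0; i < 5; ++i)
            std::cout << dist(gen) << '\n';
    }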


Which is why Python’s random.random() (and friends) all just use the Mersenne Twister, and programmers can happily use it for their Monte Carlo simulations (or whatever), blissfully ignorant of the details of PRNGs.


It's plenty random enough for non-cryptographic uses of randomness, like shuffling a music player.


It's not clear that it's random enough for Monte Carlo simulations, non-cryptographic though they may be. IIRC the standard random() has a pretty dismal sequence length.


Most standard random routines have a period of 2^32 or 2^31. What they don't usually have is random low-order bits. They're often linear congruential generators: just a multiply and an add.
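A sketch of that classic form, using the well-known ANSI C example constants for illustration:

    #include <cstdint>

    // Linear congruential generator: literally a multiply and an add.
    // With modulus 2^32, the low k bits repeat with period 2^k, so
    // the low-order bits are very predictable (bit 0 just alternates).
    std::uint32_t lcg_state = 1;

    std::uint32_t lcg_next() {
        lcg_state = lcg_state * 1103515245u + 12345u;
        return lcg_state;
    }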



Not quite what the article is talking about, but one of my greatest "aha!" moments in math/code was when I realized how handy the seemingly redundant atan2 function is.


That is precisely what the article is talking about: math library functions that seem at first glance to be superfluous, until you happen to need exactly the thing they are designed to do.


It's much, much more than that - these are math library functions that do what you want, and are such that the obvious implementations using the obvious math functions are dangerously unstable or inaccurate.


Things would be somewhat more obvious still if we had one function that returned [sin(θ), cos(θ)] (called, say, tan2) and another that did atan2. To be honest, there are undoubtedly more intuitive names for these, but whatever.

Then atan2(tan2(θ)) = θ (for θ in (-π, π]), and tan2(atan2(y, x)) = [y, x] (for [y, x] on the unit circle).
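A sketch of that hypothetical pair (these names are this comment's invention, nothing standard):

    #include <cmath>
    #include <utility>

    // "tan2": the point on the unit circle at angle theta, as [y, x].
    std::pair<double, double> tan2(double theta) {
        return { std::sin(theta), std::cos(theta) };
    }

    // Round trip: std::atan2(p.first, p.second) recovers theta,
    // up to rounding, for theta in (-pi, pi].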

I've long thought that the traditional/current form of our trig functions makes them harder to learn and use than they should be. Sadly, basic and widespread abstractions are almost impossible to change.


Well, it's similar in the sense that you lose information you might sometimes need by doing atan(y/x) to find an angle. The difference is that it's about range rather than precision...


Maybe this is why Intel gives us F2XM1 (2^x - 1) as an opcode instead of what would seem more useful, plain 2^x? Still, it requires a lot of thought from the programmer to realise they might gain some precision by removing pows and exps and splitting them up to avoid precision loss, and even more to make the jump to assembler in order to do it.
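The same trick appears in libm as expm1 and log1p; a quick illustration of the cancellation the "- 1" form avoids (the 1e-12 input is arbitrary):

    #include <cmath>
    #include <cstdio>

    int main() {
        double x = 1e-12;
        // exp(x) rounds to a value very near 1.0, so subtracting 1
        // wipes out most of the significant digits.
        std::printf("%.17g\n", std::exp(x) - 1.0);
        // expm1 computes e^x - 1 directly, keeping full precision.
        std::printf("%.17g\n", std::expm1(x));
    }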


Another reason is that the operation may already be needed as a component of a trigonometric function, and it doesn't cost much to promote it to a separate opcode.


There is no 2^x opcode btw...


I was assuming I'd find this:

http://en.wikipedia.org/wiki/C%2B%2B_Technical_Report_1#Math...

Parts of C++ that don't seem to be quite generally useful enough to warrant inclusion in a standard library.


why do they all define pi when it's just 22/7?

just kidding ;p




