
> Note that in Python there is a math.fsum function which returns a perfectly rounded summation of floats.

Quirks of the underlying data can yield surprising results.

I was playing around with the "well-known" floating point error result of 0.3 - 0.2 - 0.1 = some small number (instead of 0), and found:

    >>> sum((0.3, -0.2, -0.1) * 1_000_000) / 3e6
    -2.775557561562891e-23
    >>> math.fsum((0.3, -0.2, -0.1) * 1_000_000) / 3e6
    -9.25185853854297e-18
    >>> stable_mean((0.3, -0.2, -0.1) * 1_000_000)
    -1.387894586067203e-17

Namely, the "naive" mean yielded the smallest error (due to cancellation lining up just right such that it's capped after two rounds), fsum yielded a nominally larger one, and stable_mean yielded the biggest error.
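
For anyone who wants to reproduce the comparison: stable_mean is not defined anywhere in this comment, so the sketch below assumes it is the usual running-mean update, mean_k = mean_{k-1} + (x_k - mean_{k-1}) / k. The helper names naive_mean and fsum_mean are made up for illustration. If the original stable_mean differs, the digits it prints will differ from the ones quoted here.

```python
import math

def naive_mean(values):
    """Left-to-right sum divided by the count."""
    return sum(values) / len(values)

def fsum_mean(values):
    """Correctly rounded sum (math.fsum) divided by the count."""
    return math.fsum(values) / len(values)

def stable_mean(values):
    """Running mean; an ASSUMED definition, not taken from the thread."""
    mean = 0.0
    for k, x in enumerate(values, start=1):
        # Move the current mean toward x by 1/k of the gap.
        mean += (x - mean) / k
    return mean

# Smaller input than in the comment, to keep the run short.
data = (0.3, -0.2, -0.1) * 1_000
for f in (naive_mean, fsum_mean, stable_mean):
    print(f.__name__, f(data), f(sorted(data)))
```

Only fsum_mean is expected to be independent of input order, since math.fsum rounds the exact sum once at the end.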

Of course, if you sort the values (such that we're accumulating -0.2 + ... + -0.2 + -0.1 + ... + -0.1 + 0.3 + ... + 0.3), then sum performs much worse, stable_mean is a little worse, and fsum performs exactly the same.

    >>> sum(sorted((0.3, -0.2, -0.1) * 1_000_000)) / 3e6
    -1.042215244884126e-12
    >>> math.fsum(sorted((0.3, -0.2, -0.1) * 1_000_000)) / 3e6
    -9.25185853854297e-18
    >>> stable_mean(sorted((0.3, -0.2, -0.1) * 1_000_000))
    -3.0046468865706775e-16

edit: Naturally, if you want accurate decimal calculations, you should consider using decimal.Decimal or fractions.Fraction (albeit at a significant performance cost). Using numpy.float128 will help, but it is still binary floating point, so it remains subject to rounding errors due to imprecise representation.
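
A minimal sketch of that point, using only the standard library. The one thing to watch is that the exact types have to be built from strings or integers; building them from a float literal just carries the binary rounding error along.

```python
from decimal import Decimal
from fractions import Fraction

# Exact decimal and rational arithmetic: the difference is exactly zero.
dec = Decimal("0.3") - Decimal("0.2") - Decimal("0.1")
frac = Fraction(3, 10) - Fraction(2, 10) - Fraction(1, 10)

# Binary doubles: 0.3, 0.2 and 0.1 are each rounded before subtracting.
flt = 0.3 - 0.2 - 0.1

print(dec == 0, frac == 0, flt == 0)

# Constructing from a float keeps the float's (inexact) value.
print(Decimal(0.3) == Decimal("0.3"))
print(Fraction(0.3) == Fraction(3, 10))
```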



Some of these results are indeed very surprising. The "correct" result of (double(0.3) - double(0.2) - double(0.1)) is -1 * 2^-54 = -5.5511151231257827e-17, and dividing by three comes out to -1.8503717077085942e-17.

Hence, as far as those functions are concerned, the naive mean yielded the worst error in both tests, but stable_mean coincidentally happens to be the best in the first test. I'm slightly surprised that none of them got the correct result.


I think you might be off by one for the "correct" result (i.e. 2^-55), which is what fsum arrives at.

    >>> 0.3 - 0.2 - 0.1
    -2.7755575615628914e-17
    >>> (0.3 - 0.2 - 0.1) / 3
    -9.25185853854297e-18

But either way, the fundamental problem in this case (as you noted) is actually the precision lost when 0.3, 0.2 and 0.1 are represented as binary doubles.
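
One way to check the exponent without reasoning by hand is to convert each double to its exact rational value with fractions.Fraction and subtract exactly. This is just a sketch for verification; the variable name exact is made up for illustration.

```python
import math
from fractions import Fraction

# Fraction(float) is exact: it captures the double's true binary value.
exact = Fraction(0.3) - Fraction(0.2) - Fraction(0.1)

# The exact difference of the three doubles is -2^-55.
print(exact == Fraction(-1, 2**55))

# Plain float subtraction and math.fsum both land on the same value.
print(0.3 - 0.2 - 0.1 == -(2.0 ** -55))
print(math.fsum((0.3, -0.2, -0.1)) == -(2.0 ** -55))
```

All three comparisons hold, which is consistent with 2^-55 rather than 2^-54.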


Heh, I reasoned about the exponent by hand and got -55, but then wasn't sure whether it was correct, so I searched for discussions of 0.3 - 0.2 - 0.1 and took the value they quoted (2^-54). That being an off-by-one error does resolve my question.



