> When I format a float out to 5 decimal places, I'm sort of making a statement that anything beyond that doesn't matter to me.
Yes, it's true that the vast majority of the time it doesn't actually matter. However, decimal formatting does such a good job of giving us the impression that these errors are merely edge cases, and calculators automatically formatting to 10 significant figures etc. further that illusion. If people aren't aware that it's only an illusion (or just how far the rabbit hole goes), it can be dangerous when they go on to create or use things where that fact matters.
haha, yes it's a terrible name, such arrogance to suggest fitting infinite things into a very finite thing. In fact they couldn't even call it rational, because even after the precision limitation it's a subset of the rationals (which is where representation error comes from, due to the base)... Inigo Quilez came up with a really interesting way around this limitation where numbers are encoded with a numerator and denominator, which he called "floating bar". This essentially has no representation error, but it will likely hit precision errors sooner, or at least in a different way (it's kind of difficult to compare directly).
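To make the numerator/denominator idea concrete, here's a much-simplified sketch of an exact rational type. It is not iq's actual "floating bar" bit packing, just an illustration of why representation error disappears, and why overflow of the numerator/denominator becomes the new failure mode:

    #include <cstdio>
    #include <numeric>   // std::gcd (C++17)

    // Exact rational: keeps a numerator/denominator instead of a base-2 fraction.
    struct Rational {
        long long num, den;
        Rational(long long n, long long d) : num(n), den(d) { reduce(); }
        void reduce() {
            long long g = std::gcd(num < 0 ? -num : num, den);
            if (g) { num /= g; den /= g; }
        }
        Rational operator+(const Rational& o) const {
            // Overflow of these products is the new failure mode ("precision error").
            return Rational(num * o.den + o.num * den, den * o.den);
        }
        bool operator==(const Rational& o) const { return num == o.num && den == o.den; }
    };

    int main() {
        Rational a(1, 10), b(2, 10);                // 0.1 and 0.2, held exactly
        printf("%d\n", a + b == Rational(3, 10));   // 1 -- no representation error
    }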
Yeah, that is more what I'm on about. I can accept that a computer's `int` type is not an infinite set, even if it does cause problems at the boundaries. It feels like more of a little white lie to me.
Whereas, even between their minimum and maximum values, and even subject to their numeric precision, there are still so many rational numbers that an IEEE float can't represent. So it's not even a rational. Nor can it represent a single irrational number, thereby failing to capture one iota of what qualitatively distinguishes the real numbers. . . if Hacker News supported gifs, I'd be inserting one that features Mandy Patinkin right here.
Yeah, I think this is the single aspect that everyone finds unintuitive; everyone can understand it not having infinite precision or magnitude. It's probably a very reasonable design choice, if we could just know the reasoning behind it; I assume it's mostly about the practicality of performance and of implementing the operators that have to work with the encoding.
>Inigo Quilez came up with a really interesting way around this limitation where numbers are encoded with a numerator and denominator, he called "floating bar"
Thanks for the read! More fun than the original article to my taste :)
Here's the link, since Googling "floating bar" nets you very different results:
I always wonder if he ever says "My name is Inigo Quilez. Prepare to learn!".
My favorite post of his is the one where he explains that you never need trigonometry calls in your 3D engine. Because every now and then you still see an "educational" article in the spirit of "learn trig to spin a cube in 3D!" :/
Yeah, I've noticed in physics programming that in other people's code you can often come across the non-trig ways of doing things that avoid unit vectors and square roots etc. (which are both elegant and efficient)... However, I've never personally come across an explicit explanation or "tutorial" for these targeted at any level of programmer's vector proficiency. Instead I've always discovered them in code and had to figure out how they work.
I guess the smart people writing a lot of this stuff just assume everyone will derive them as they go. That's why we need more Inigo Quilezes :D to lay it out for us mere mortals and encourage their use more widely.
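For anyone curious, a couple of the standard tricks of this kind (my own sketch, not taken from his posts): comparing squared lengths so no square root is needed, and using the sign of a 2D cross product to answer "which side?" with no trig calls and no normalization:

    #include <cstdio>

    struct Vec2 { float x, y; };

    // Is point p within radius r of c?  Compare squared lengths: no sqrt needed.
    bool WithinRadius(Vec2 p, Vec2 c, float r) {
        float dx = p.x - c.x, dy = p.y - c.y;
        return dx * dx + dy * dy <= r * r;
    }

    // Is the target to the left of the facing direction?  The sign of the 2D
    // cross product answers this with no atan2/asin and no unit vectors.
    bool IsLeftOf(Vec2 facing, Vec2 toTarget) {
        return facing.x * toTarget.y - facing.y * toTarget.x > 0.0f;
    }

    int main() {
        printf("%d %d\n", WithinRadius({3, 4}, {0, 0}, 5.0f),   // 1
                          IsLeftOf({1, 0}, {0, 1}));            // 1
    }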
"integer" as a type is less offensive though, as it only has one intuitive deficiency compared the mathematical definition (finite range). Where as "real" as a type has many deficiencies... it simply does not contain irrational numbers, and it does not contain all rational numbers in 3 respects: range, precision and unrepresentable due to base2, and for integers it can represent a non-contiguous range!
But I can build a computer that would accept any pair of integers you sent it, add them and return the result. Granted, they'd have to be streamed over a network, in little-endian form, but unbounded integer addition is truly doable. And the restriction is implicit: you can only send it finite integers, and given finite time it will return another finite answer.
You can't say the same about most real numbers, all the ones that we postulate must exist but that have infinite complexity. You can't ever construct a single one of them.
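A sketch of the streamed adder described above (the interface is made up for illustration; the vectors stand in for the little-endian byte streams). The whole thing needs only one byte's worth of state, the carry:

    #include <cstdint>
    #include <vector>

    // Add two unbounded non-negative integers given as little-endian byte streams.
    // Each output byte can be emitted as soon as the matching input bytes arrive;
    // only the carry has to be remembered between steps.
    std::vector<uint8_t> AddLittleEndian(const std::vector<uint8_t>& a,
                                         const std::vector<uint8_t>& b) {
        std::vector<uint8_t> out;
        unsigned carry = 0;
        for (size_t i = 0; i < a.size() || i < b.size() || carry; ++i) {
            unsigned sum = carry;
            if (i < a.size()) sum += a[i];
            if (i < b.size()) sum += b[i];
            out.push_back(static_cast<uint8_t>(sum & 0xFF));
            carry = sum >> 8;
        }
        return out;
    }

    int main() {
        // 255 + 1 = 256, i.e. bytes {0xFF} + {0x01} -> {0x00, 0x01}
        std::vector<uint8_t> r = AddLittleEndian({0xFF}, {0x01});
        return r[0] == 0x00 && r[1] == 0x01 ? 0 : 1;
    }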
It is not a matter of modernity, but a matter of use-case. You don't need arbitrary-length integers for a coffee machine or a fridge. You don't even need FP to handle e.g. temperatures; fixed-point is often more than enough. So if you are making some sort of "portable assembler" for IOT devices, you can safely stick with simple integers.
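For example, a fridge or coffee-machine temperature fits comfortably in a fixed-point integer. This is just a sketch with a made-up unit (hundredths of a degree), not any particular device's firmware:

    #include <cstdint>
    #include <cstdio>

    // Temperature in hundredths of a degree Celsius. int32 covers roughly
    // +/- 21 million degrees at 0.01 degree resolution, and addition and
    // subtraction are exact -- no floating point needed.
    using CentiDegC = int32_t;

    int main() {
        CentiDegC setpoint = 9250;                // 92.50 C
        CentiDegC reading  = 9178;                // 91.78 C
        CentiDegC error    = setpoint - reading;  // exactly 72, i.e. 0.72 C
        printf("error = %d.%02d C\n", error / 100, error % 100);
    }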
> I think that illusion can be a bit dangerous when we create things or use things based on that incorrect assumption.
I'd be curious to hear some of the problems programmers have run into from this conceptual discrepancy. We've got probably billions of running instances of web frameworks built atop double-precision IEEE 754 to choose from. Are there any obvious examples you know of?
Operations that you think of as being associative are not. A simple example is adding small and large numbers together. If you add the small numbers together first and then the large one (e.g. sum from smallest to largest), the small parts are better represented in the sum than if you sum from largest to smallest. This could happen if you have a series of small interest payments and are applying them to the starting principal.
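A minimal sketch of that ordering effect, using single precision to make it obvious (the magnitudes are made up):

    #include <cstdio>

    int main() {
        float big   = 1e8f;    // the "principal"
        float small = 0.01f;   // ten thousand tiny "payments"

        // Largest first: each 0.01 is below half an ulp of 1e8 and is discarded.
        float a = big;
        for (int i = 0; i < 10000; ++i) a += small;

        // Smallest first: the small values accumulate, then survive the final add.
        float s = 0.0f;
        for (int i = 0; i < 10000; ++i) s += small;
        float b = big + s;

        printf("%.1f\n", a);   // exactly 100000000.0 -- every payment was lost
        printf("%.1f\n", b);   // within one float ulp of the true 100000100
    }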
I've worked with large datasets, aggregating millions of numbers (summing, dividing, averaging, etc.), and I have tested the order of operations, trying to force some accumulated error, and I've never actually been able to show any difference in the realm of the 6-8 significant digits I looked at.
Here's a common problem that shows up in implementations of games:
    class Time {
        uint32 m_CycleCount;
        float  m_CyclesPerSec;
        float  m_Time;
    public:
        Time() {
            m_CyclesPerSec = CPU_GetCyclesPerSec();
            m_CycleCount   = CPU_GetCurCycleCount();
            m_Time         = 0.0f;
        }
        float GetTime() { return m_Time; }
        void Update() {
            // note that the cycle counter is expected to wrap
            // during the lifetime of the game --
            // unsigned modular math still gives the right delta
            // as long as Update() is called at least once
            // every 2^32-1 cycles.
            uint32 curCycleCount = CPU_GetCurCycleCount();
            float dt = (curCycleCount - m_CycleCount) / m_CyclesPerSec;
            m_CycleCount = curCycleCount;
            m_Time += dt;
        }
    };

    void GAME_MainLoop() {
        Time t;
        while( !GAME_HasQuit() ) {
            t.Update();
            GAME_step( t.GetTime() );
        }
    }
The problem is that m_Time will become large relative to dt, the longer the game is running. Worse, as your CPU/GPU gets faster and the game's framerate rises, dt becomes smaller. So something that looks completely fine during development (where m_Time stays small and dt is large due to debug builds) turns into a literal time bomb as users play and upgrade their hardware.
At 300fps the per-frame increments are already being rounded quite coarsely once the game has been running for around 8 hours, and time will literally stop advancing once m_Time passes about 65,536 seconds (roughly 18 hours), when dt drops below half an ulp of m_Time; in-game things that depend on framerate can become noticeably jittery well before then.
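A standalone sketch of how the increments degrade (not the game code above, just float arithmetic at the same magnitudes):

    #include <cstdio>

    int main() {
        const float dt = 1.0f / 300.0f;       // ~3.33 ms frame time at 300 fps

        float t1 = 8.0f * 3600.0f;            // ~8 hours of accumulated time
        printf("%.9f\n", (t1 + dt) - t1);     // 0.003906250 -- quantized, not ~0.003333333

        float t2 = 20.0f * 3600.0f;           // ~20 hours of accumulated time
        printf("%.9f\n", (t2 + dt) - t2);     // 0.000000000 -- the frame is lost entirely
    }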
If I'm going to use a 64 bit type for time I'd probably just use int64 microseconds, have over 250,000 years of uptime before overflowing, and not have to worry about the precision changing the longer the game is active.
So, use fixed point? You could do that, but you can't make every single time-using variable fixed point without a lot of unnecessary work, and without sufficient care you end up with less precision than floating point. If you don't want to spend a ton of upfront time carefully optimizing every variable just to avoid wasting 10 exponent bits, default to double.
> in the realm of 6-8 significant digits I looked at
That is far inside the range of 64-bit double precision. Whether error propagates up to that range of significance depends on the math, but I doubt the aggregation you are describing would cause it... provided nothing silly happens to subtotals, like intermediate rounding to a fixed precision (you'd be surprised).
Something like the compounding the parent was describing is far more prone to significant error propagation.
I've seen rounding errors on something as simple as adding the taxes calculated per line item vs calculating a tax on a subtotal. This was flagged as incorrect downstream where taxes were calculated the other way.
In a real-life transaction where pennies are not exchanged this could mean a difference of a nickel on a $20 purchase which isn't a meaningful difference but certainly not insignificant.
How much was the difference? Was there any rounding involved at any step? When dealing with money, I see rounding and integer math all the time. As another comment has mentioned, within 53 bits of mantissa the number range is so big, we are talking 16 digits. I'd be curious to see a real-world example where the float math is the source of error, as opposed to some other bug.
It doesn't take much imagination to construct an example. Knowing 0.1 isn't exactly representable, make a formula with it that should come out to exactly half a cent. Depending on the floating-point precision it will land slightly above or below half a cent, and rounding will not work as expected. We found this in prod at a rate of hundreds to thousands of times per day; it only takes volume to surface unlikely errors.
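Not their production formula, but a minimal illustration of the same class of failure: an amount meant to be exactly half a cent above a dollar, where round-half-up silently goes the wrong way because the stored value is a hair below 1.005:

    #include <cmath>
    #include <cstdio>

    int main() {
        double amount = 1.005;                    // meant to be exactly $1.00 + half a cent
        printf("%.20f\n", amount);                // 1.00499999999999989342 -- already short
        double cents = std::round(amount * 100.0) / 100.0;
        printf("%.2f\n", cents);                  // 1.00, not the 1.01 that half-up implies
    }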
The person I replied to claims to have looked at large* data sets and never seen a discrepancy in 6-8 significant digits. I thought I'd show them a small data set with 3 samples that retains no significant digits.
* Never mind that "millions" isn't large by current standards...
But you are observing all the digits, not just 6-8. It's implicit in the semantics of the operation, and that's something everyone who works with floating point should know.
You're making the same mistake, but now it's less obvious because of the change of scale. When you compare floating-point numbers, a simple == is not usually what you want; you need to compare them with a tolerance. Choosing the tolerance can be difficult, but in general when working with small numbers you need a small tolerance and with large numbers a large one. This dataset involves datapoint(s) at 1e20; at that magnitude, whatever you're measuring, the error in your measurements is going to be way more than 1, so a choice of tolerance ≤ 1 is a mistake.
Ugh, you're preaching to the choir. I wasn't trying to make a point about the equality operator, I was trying to make a point about x and y being completely different. I must be really bad at communicating with people.
That construction can turn any residual N digits out into a first-digit difference. It wouldn't matter unless you make comparisons with a tolerance way under the noise floor of the dataset. But yes, you have technically invented a situation that differs from that real-world anecdote in regard to that property, under an extremely literal interpretation.
And here I was, worried I might be tedious or pedantic, trying to argue that floating point is just not that simple. You've really outdone me in that regard.
JavaScript is exact within the ±2^53 integer range. It's unlikely that you're operating with numbers outside of that range if you're dealing with real-life things, so for most practical purposes doubles are enough.
> The right answer is 0.0, and the most it can be wrong is 0.2. It's nearly as wrong as possible.
Just to clarify for others: you're implicitly contriving that to mean you care about the error being positive. The numerical error in 0.1 % 0.2 is actually fairly ordinarily tiny (on the order of 10^-17), but using modulo may create sensitivity to these tiny errors by introducing a discontinuity where it matters.
I mistakenly used 0.1 instead of 1.0, but the _numerical_ error is still on the order of 10^-17; the modulo is further introducing a discontinuity that creates sensitivity to that tiny numerical error. Whether that is a problem depends on what you are doing with the result... 0.19999999999999996 is very close to 0 as far as modulo is concerned.
I'm not arguing against you, just clarifying the difference between propagation of error into significant numerical error through something like compounding, and being sensitive to very tiny errors by depending on discontinuities such as those introduced by modulo.
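For reference, the 1.0 % 0.2 case in double precision, as a small sketch of the discontinuity being discussed:

    #include <cmath>
    #include <cstdio>

    int main() {
        // 0.2 is stored as 0.200000000000000011..., so 1.0 is just *under* five of
        // them, and the remainder lands a hair below 0.2 instead of at 0.
        printf("%.17f\n", std::fmod(1.0, 0.2));   // 0.19999999999999996
        // The underlying error is ~1e-17; the modulo's wrap at 0 is what makes the
        // result look "as wrong as possible", even though 0.1999... is, modulo 0.2,
        // extremely close to 0.
    }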
I'm talking about integer numbers. 2^53 = 9007199254740992. You can do any arithmetic operations with any integer number from -9007199254740992 to 9007199254740992 and results will be correct. E.g. 9007199254740991 + 1 = 9007199254740992. But outside of that range there will be errors, e.g. 9007199254740992 + 1 = 9007199254740992
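The same boundary shown with plain doubles, for anyone who wants to see it (a tiny sketch; nothing JS-specific about it):

    #include <cstdio>

    int main() {
        double limit = 9007199254740992.0;       // 2^53
        printf("%.1f\n", limit - 1.0 + 1.0);     // 9007199254740992.0 -- still exact
        printf("%.1f\n", limit + 1.0);           // 9007199254740992.0 -- the +1 is lost
        printf("%.1f\n", limit + 2.0);           // 9007199254740994.0 -- spacing is now 2
    }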
You are describing only one of the types of numerical error that can occur, and it is not commonly a problem: it is an edge case that occurs at the significand limit, where the exponent alone must be used to approximate larger magnitudes, at which point the representable integers become non-contiguous.
The types of errors being discussed by others are all in the realm of non-integer rationals, where limitations in either precision or representation introduce error that then compounds through operations no matter the order of magnitude... and btw, _real_ life tends to contain _real_ numbers, which commonly means non-integer rationals in uses of IEEE 754.
Actually this is a source of many issues, where a 64-bit int, say a DB autoincrement id, can't be exactly represented in a JS number. Not an "in real life" value, but still a practical concern.
I spent a day debugging a problem this created. Without going into irrelevant domain details, we had a set of objects, each of which has numeric properties A and B. The formal specification says objects are categorized in a certain way iff the sum of A and B is less than or equal to 0.3.
The root problem, of course, was that 0.2 + 0.1 <= 0.3 isn't actually true in floating point arithmetic.
It wasn't immediately obvious where the problem was since there were a lot of moving parts in play, and it did not immediately occur to us to doubt the seemingly simple arithmetic.
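For anyone who wants to see the comparison in question, a minimal sketch (the 1e-9 tolerance is an arbitrary choice for illustration; picking it well is its own problem, as discussed elsewhere in the thread):

    #include <cstdio>

    int main() {
        double a = 0.2, b = 0.1, limit = 0.3;
        printf("%.17f\n", a + b);                  // 0.30000000000000004
        printf("%d\n", a + b <= limit);            // 0 -- the spec's condition "fails"
        printf("%d\n", a + b <= limit + 1e-9);     // 1 -- comparing with a tolerance
    }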
I can't show you, but my job involves writing a lot of numerical grading code (as in code that grades calculated student answers in a number of different ways). I've had the pleasure of seeing many other systems' pretty horrible attempts at this, both from the outside and in, and in both cases numerical errors rooted in floating-point math abound. To give an easy example, a common numerical property required for building formatters and graders of various aspects (scientific notation, significant figures, etc.) is the base-10 floored order of magnitude. The most common way of obtaining it is numerically, using logarithms, but this has a number of unavoidable edge cases where it fails due to floating-point error, resulting in a myriad of grading errors and formatting that is off by one significant figure.
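One way that off-by-one can show up (a standalone sketch, not that grading code): a computed value that should be exactly 1.0 lands just below it, and the floored logarithm jumps down an entire order of magnitude:

    #include <cmath>
    #include <cstdio>

    int main() {
        // Ten increments of 0.1 should total exactly 1.0 (order of magnitude 0)...
        double x = 0.0;
        for (int i = 0; i < 10; ++i) x += 0.1;
        printf("%.17g\n", x);                          // 0.99999999999999989
        printf("%g\n", std::floor(std::log10(x)));     // -1, not the expected 0
    }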
These are an easy target to find issues that _matter_ because users are essentially fuzzing the inputs, so they are bound to find an error if it exists, and they will also care when they do!
When these oversights become a problem is very context sensitive. I suppose mine is quite biased.