Why does 0.1 and 0.2 = 0.30000000000000004? (jvns.ca)
339 points by soheilpro on Feb 8, 2023 | 361 comments



Both Excel and Google Sheets return FALSE for this expression:

    2.03 - 2 - 0.03 = 0
The vast majority of data transformation and BI tools (Power BI, PowerQuery, Tableau, etc.) return FALSE for this expression:

    0.1 + 0.2 = 0.3
That's because they use floats instead of decimals, and that introduces subtle errors in the data. These errors usually go unnoticed because no one expects errors in basic math. It's a mystery to me why most commercial software intended for business and financial calculations doesn't use fixed point decimals. My post about this: https://www.linkedin.com/feed/update/urn:li:activity:7028101...

PS. If you design software that works with money amounts, always use fixed point decimals. Don't use floats, it's just wrong!
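
For anyone who wants to see the difference outside a spreadsheet, here is a minimal Python sketch of the same two expressions, once with binary doubles and once with the decimal module (illustrative only):

    from decimal import Decimal

    # Binary doubles: both comparisons fail
    print(2.03 - 2 - 0.03 == 0)    # False (the result is a tiny nonzero number, roughly -1.7e-17)
    print(0.1 + 0.2 == 0.3)        # False (0.1 + 0.2 evaluates to 0.30000000000000004)

    # Exact decimal arithmetic: both comparisons hold
    print(Decimal("2.03") - 2 - Decimal("0.03") == 0)            # True
    print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))     # True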


Paging Colin Percival to talk about picodollars. The medium strength advice on this topic is to use integer math with cents as your unit. Colin’s advice is to choose the smallest possible unit you can that will avoid overflow (I think) hence why he prices tarsnap in picodollars: https://www.tarsnap.com/picoUSD-why.html


Picodollars are awesome, but I actually do Tarsnap's accounting in attodollars since it's convenient to use a 64-bit unsigned integer to represent units of 10^-18.


This is something people are living with because it's very rare to use exact equality tests on floats in BI applications to begin with. Far more people want to look at the sum of order amounts, or at orders where the amount is within a certain range, than at orders where amount is exactly equal to some random float.


People are living with it until they stumble when an innocent expression

   (a + b) >= c 
suddenly fails to work. For instance, in Excel the expression

   A1-B1-C1 >= 0
returns FALSE when A1=2.03, B1=0.03, and C1=2. Such an expression can be used for instance to filter records that fall within a certain range, and that filter would produce wrong results.


It would lead to some hilarious malpractice if anyone understood floating point and stole a billion cents. Things like this have happened and will continue to happen if people aren't aware of floating point error.


> of floating point error.

Worth noting that this is a risk whenever there is rounding. If this chain of logic happens with cents as ints, it is still wrong:

    1 / 2 => 0
    0 * 2 => 0
Therefore:

    1 == 0


> If you design software that works with money amounts, always use fixed point decimals. Don't use floats, it's just wrong!

Don’t write with such certainty! Decimal math is great advice for many/most situations, but what if you have a LOT of numbers and not a lot of time? That big number crunching GPU is not available if you take this approach.

Numerical Methods was the most difficult CS course I took in university and also the one I did the worst in. And of course it was an elective, or else they wouldn’t have graduated many people at all. If you’re doing a lot of number crunching stuff, maybe you should ask people that know how to number crunch to design your system so it has the smallest errors :)

PS - I’m not the person for that job!


I am almost completely certain that neither Google Sheets, nor Excel use that "big number crunching GPU" for anything float-related.

Just getting the data to the GPU for a compute shader to run on it would take longer than just doing it on the CPU in almost every case.


I'm not sure if the parent poster meant GPU or CPU. While, yeah, the GPU has good floating point math, you probably don't use that for things like Excel.

On the other hand the CPU has floating point math too, and floating point math is MUCH faster than decimal math.

So his point holds if you replace GPU with CPU, but his use of GPU is likely inaccurate.


Ah, yeah. The performance of modern FPUs really kills, and it's changing a lot of the assumptions I used to have about performance.

Anecdote: In state of the art physics code, it's common to use integers for coordinates [1]. Recently, I was working on a BSc. toy project, and I used ints for coordinates as is common. With both ints and floats it's important to be careful around functions like `tan` that go to infinity, but of course floats are more forgiving, so I prototyped some of the code using doubles.

I ended up comparing performance and it wasn't even funny. Double precision arithmetic was anywhere between 3x faster (where a good int algorithm is known) and 100x faster (if the int algorithm is CORDIC, for example) than integers.

1: Springel 2013 p. 1-82 https://wwwmpa.mpa-garching.mpg.de/gadget4/gadget4-code-pape...


> use of GPU is likely inaccurate.

The number 1 supercomputer in the world has 8 million GPU cores (or is it 8 million GPUs? Execution Units?) and 600k CPU cores.

They are doing floating point math on GPUs. Numerical analysis [1] is used to create the best accuracy possible. This is something that has been done for thousands of years, basically since math has had problems with no "exact" solution.

[1] https://en.wikipedia.org/wiki/Numerical_analysis


> you probably don't use that for things like excel

I'm not sure when I read it, but long ago I read some kind of AMA from an MS engineer working on Excel saying that his greatest achievement was working on the team that made Excel's DAG solver trivially parallelizable. In the same thread, he mentioned that offloading to the GPU was being looked into. I guess it never came to fruition.


We should be optimizing for correctness over performance by default, though. People who need binary floating point for perf reasons should already know the tradeoffs.


I remember reading on a forum many years ago where one person was rationalising performance over correctness; someone replied that if their code didn't have to be correct, they could get the answer in 1 clock cycle with zero memory usage. :-)


Numbers (from Apple) returns TRUE for both expressions you list


Objective-C has a really nice decimal math library. Last I looked (it’s been a while), Swift didn’t. It might have one by now.

Years ago, I had a few apps in the App Store that made extensive use of it. It really was very nice to work with.


Swift has the same functionality in the Decimal type.


Thanks! It looks like that was added in Swift 3. That was a long time ago. I guess I’m just really old ;)


Under the hood, modern Numbers stores values in Decimal128 (16 bytes)


The really surprising* thing is that Google Sheets uses floats for everything. Back in the day, I was using Sheets to do some statistics about (IIRC) kernel ASLR on macOS, and I was surprised to see kernel pointers ending in impossible digits. Of course only after I'd wasted 2 hours on it.

Boy, did I file a pissy bug with the Sheets team that day, and then requested an Excel install I never let go of since that day.

* I guess it's maybe not surprising to js developers, but don't most modern browsers have integers by now?


> Google Sheets uses floats for everything (...) and then requested an Excel install

But Excel also uses binary floats (IEEE 754 double precision, per the linked Microsoft page), because of "compatibility with Lotus 1-2-3".

https://softwarerecs.stackexchange.com/questions/53292/any-s...

https://learn.microsoft.com/en-us/office/troubleshoot/excel/...


Excel uses floats in its own ways that even experts on floating-point arithmetic find inscrutable[1].

[1] https://people.eecs.berkeley.edu/~wkahan/Mindless.pdf, §2


Ah, but it used ints (or bignum) to represent my pointer values, instead of silently losing precision. This is an anecdote, maybe there are other gotchas I am blissfully ignorant of. (Also, this was 8 years ago.)


How do Sheets and Excel differ in this regard? How did using floats cause some number to be odd?


GP didn't say odd, or at least not numerically odd, but rather "ending in impossible digits".

Floats have varying precision, not uniform. The closer the number is to zero, the more precision it has. There's a point where the precision is so low that it only covers whole numbers, and then a point after that where the precision is less than whole numbers.

Given the context of pointers, it's quite possible that they were large enough to reach that less-than-whole-number-precision range.

EDIT: For 32-bit floats, that's apparently any number above 16,777,216. Which seems surprisingly small to me. For 64-bit doubles that goes to 9,007,199,254,740,992.

https://blog.demofox.org/2017/11/21/floating-point-precision


I think the comment may have been edited ;)

Anyway, yes, that's all true. Excel, Sheets, JavaScript numbers, Java doubles, Python floats, and so on all work this way. That's why I asked how switching spreadsheet implementations solved the problem.


Ah, sorry for that, I must have seen the updated one.

I agree with you that switching programs that are all using the IEEE floating-point formats with the same underlying hardware implementations should result in the same answers.


besides bigInt...

What numbers in javascript are not actually doubles (what js uses under the hood for every number)?

You can apparently convert them from doubles to 32-bit ints by some bitwise hacks like the following (I'm pretty sure they still are 'doubles' though - it just does some rounding tricks):

    |0     // signed
    >>>0   // unsigned

So any webapp (google sheets) will likely have problems related to floating point math.


> You can convert them from doubles to 32-ints

You can use the native BigInt arbitrary-precision integers and ignore all "fits under this arbitrary limited bit slice" problems.

You can use doubles as integers directly, in a safe way, until you touch the 53-bit barrier (Number.MIN_SAFE_INTEGER === -9007199254740991 or Number.MAX_SAFE_INTEGER === 9007199254740991).

None of those really are (modern) webapp issues.


The equality operator should not be implemented for the float type to begin with. Money amounts should use integers.


A true decimal type is better than integer, if your language supports it.


Decimals are notoriously hairy to implement and it's not obvious how they should behave when they run out of precision. Integers are almost always the better choice when the decimal is fixed, such as with currency.

(I guess it depends on what you mean by "true" decimal. If you meant BigNum, then sure.)


A lot of times, though, currency is not fixed. Stock share prices can be in fractions of a cent; per-unit cost numbers can easily have fractional cents too, just for a couple of quick examples. Sure, if you're just looking at your bank account, or itemizing actual transactions, they're going to be in whole cents only, but lots of financial calculations need more precision than this.


Having different numeric types for currency amounts (integer) and for costs or prices (floating point, whether decimal or binary) is not undesirable. Heck, it's very likely a bonus.


That sounds like a recipe for disaster. If you have fractional unit prices, and multiply by the number of units, floats can still give you errors. No, fixed decimal types definitely have their uses.


Decimals are notoriously hairy to implement

Only floating-point decimal is hairy. Fixed-point decimal is hardly more difficult than integer math.
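
To make that concrete, a minimal sketch of fixed-point money as integer cents (Python; positive amounts only, helper names invented for illustration):

    CENTS_PER_DOLLAR = 100

    def to_cents(amount: str) -> int:
        # parse "2.03" as 203 integer cents; no binary float is ever involved
        dollars, _, cents = amount.partition(".")
        return int(dollars) * CENTS_PER_DOLLAR + int((cents + "00")[:2])

    def apply_rate(amount_cents: int, num: int, den: int) -> int:
        # e.g. 3% = 3/100; the only place a rounding rule is needed (here: round half up)
        return (amount_cents * num * 2 + den) // (den * 2)

    print(to_cents("2.03") - to_cents("2") - to_cents("0.03"))   # 0, exactly
    print(apply_rate(to_cents("19.99"), 3, 100))                 # 3% of $19.99 -> 60 cents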


If you simply look at any of the real world implementations, you'll see it's not so easy. Rust has a few readable and educational crates that implement decimals, if you want to take a look.


> Decimals are notoriously hairy to implement

That's because the binary one is implemented in hardware for you, and these days you can usually (but not always) trust the hardware to do the correct thing; if you have to implement it yourself (say, for deterministic, side-effect-free math), it's also hairy.


Decimal floating point is fine so long as you constrain the exponent such that delta between two consecutive valid numbers can never be greater than 1 (that is, there should never be any implicit trailing zeroes). .NET Decimal type is a good example.


That's assuming it's implemented as a fixed-point type, while my understanding is most follow the IEEE 754 standard, which uses floating-point types.


Fixed point fails for just about any financial calcs beyond simple addition. Try doing common compound interest calcs for example, and you'll get much worse answers.

The correct answer is to use floating point and to understand it and numerical software before doing it. If you don't have a decent understanding of numerical analysis, don't write important numerical software.


Well, but the "common" compound interest calcs aren't how compound interest is actually calculated. You use floating point to calculate the amount due, but once you do, you round to a fixed point (e.g. whole cents) and that's it, that's the final truth. The interest accrued, interest due and interest paid out is a fixed point value by definition. And then for the next interest period you start with that rounded off value as the basis for calculating the next interest which compounds.

So yes, if you use fixed point you obviously get different results, but you won't get correct results according to what's required by accounting standards unless you truncate to fixed point when needed. It's not that this difference is large - after all, it's about rounding off fractions of a cent - but accounting does need to be exact.


> aren't how compound interest is actually calculated.

Completely depends - package them in tranches to sell to secondary markets by the thousands - then you do exactly as I pointed out. Or if you're doing Monte Carlo futures projections modeled as compound interest and only need the value at a future time.... Or any of thousands of financial modeling needs....

If you're printing monthly bills for consumers then you round, but only at output, and only for viewable parts.

So you cannot claim things are not computed this way - it depends on the financial application you're working on.

>So yes, if you use fixed point you obviously get different results, but you won't get correct results according to what's required by accounting standards

Ha - which standards are these? Care to cite them? I've been through this space a long time, and every time someone tells me there is a standard and I ask them for it, they soon realize there is no gold, single standard. There are zillions of acceptable choices. There are ones for consumers, ones for intrabank, interbank, fed to bank, loans, mortgages, taxes, and on and on. There is no "correct results according to what's required by accounting standards".

Please cite your standards that apply to all these cases.

Have you worked in finance on numerical financial software?

>you round to a fixed point (e.g. whole cents) and that's it, that's the final truth

Having done numerical stuff, including finance for decades, you simply write the entire codebase in floating point, being sure to do proper analysis that things handle ranges correctly.

Then, and only for output, do you snap to desired observable precision. Never ever even once do you round something to make it look pretty, then jam it back into calculations.


The difference seems to be that you are talking about modeling and analysis and I was talking about calculating actual interest, as in the interest that is actually owed for a particular loan or contract (i.e. every particular contract) and properly accounting for that.

It doesn't matter if the interest is calculated for consumers, intrabank, interbank, fed to bank, loans, mortgages or taxes - the interest rates, interest periods, interest day basis and all kinds of details may vary, and of course accounting standards vary between countries, but the core principle is the same: it all eventually comes down to some amount of money owed to the counterparty, measured in whole cents or perhaps whole dollars or roubles or whatever, but never an arbitrary-precision float. You can't get to a final compounded result "only for output", because when interest compounds (for example, monthly), at every such point you have actual "intermediate output" which materializes into a customer-visible change of balance, from which the next period's interest is then calculated. That intermediate output gets rounded, because (unlike estimates or models) it is a specific balance owed, denominated in a currency with fixed, limited precision.

And so after many such steps, the total actual compound interest - i.e. the actual dollars and cents paid by (or to) the counterparty - is slightly different from what the common modeling approach gets if, as you state, it does rounding only for the final output. The difference is tiny, so there is no problem for modeling to ignore it, but actual financial systems (i.e. those tracking facts of money owed, not doing estimates and models for decision support) do have to come down to a rounded fixed-precision number owed for every contract at the end of every day.


There’s also a middle ground, which I find more useful than either full-floating or full-fixed. Use floating for intermediate calculations and fixed-only with autorounding for “fields”. So that:

  var t = obj.x                  // fixed -> fp
  for (var i = 0; i < n; i++) {  // repeat n times
    t += 0.1
  }
  obj.x = t                      // rounds back to fixed: adds exactly n/10
The key observation is that intermediates don’t float free long enough before being assigned back to a fixed storage. So the error has no chance to accumulate. But still can manifest in comparisons. If necessary, floating point can be replaced with precise enough fixed point for intermediates (at a cost, e.g. tens of digits).

The correct answer is to use floating point and to understand it and numerical software before doing it

Yes, but this also has associated costs and risks. Unless you’re pressed against some wall, it’s more safe to offload that to a runtime. Humans are way too unreliable when it comes to understanding numerical software.


Never ever ever ever perform summation like that. You just added O(n) error instead of O(1) error.

This is why people who feel ad-hoc methods are ok, based on untrained or not-carefully-studied analysis, should not write production numerical code.

Just use floating point everywhere, and only output things snapped to whatever resolution you want (and even that is tricky). Otherwise all those fixed to float to fixed to float going on in your code are going to add all sorts of numerical problems - each loses information.


I believe you missed the part where it rounds the error away. Of course for high N it may overflow into a significant part, but that’s 25-30 bits away for `double` and much more for custom non-fixed types, which implies a dataset that a client app wouldn’t be able to handle anyway. Multiplication is another beast, but repeating multiplication doesn’t appear in finance naturally.

In case you did not miss it: I’ve worked with and supported financial systems which do exactly that for a very long time without any micro-numerical issues^. Otoh, floating-point is a constant source of microbugs, unless all your developers are Knuth-level pros who are also versed in consulting and have no monday mornings or deadlines. It doesn’t matter if an error is O(N < 1e6) or O(1) when an underwater comparison to a limit fails and control flow triggers randomly.

^ The last [few] cent problem is usually handled explicitly, either naively (last=rest) or in a Bresenham-like way (rarely, when it matters), and is easily caught in an accounting balance when left unhandled


>I believe you missed the part where it rounds the error away.

I've been down this argument with other HN commenters some time back and explicitly demonstrated that it fails when you do it this way. It's not worth chasing down all the details again.

The short answer is it will fail, and in unexpected places. The only correct answer when doing this is to do the numerical analysis completely and correctly. This half-assed "it rounds the error away" is completely insufficient (and wrong).

The problem with letting such error slop around in code is that someone will take your code and use it to aggregate 1m loans, then your 25 bits of safety just became real money. Then someone will leverage that routine and add more problems.

When you build the lowest pieces so sloppily, it quickly contaminates the whole system. Make each piece as numerically solid as possible, otherwise you will get bitten.

If you have not proven your algorithm correct using numerical analysis for this stuff, it is not correct. End of story.

>but repeating multiplication doesn’t appear in finance naturally

Yes it does - compound interest if you need periods and tables.

And we're in agreement - floating point, not fixed point, is how to do financial calculations. I'm amazed how many people on HN want to argue that fixed point works when it's easy to demonstrate it fails in terrible ways and is significantly more error prone than simply using doubles (or double- or quad- doubles when needed).


Maybe it will, I’m only halfway there. I’ll take the risk, cause your solution (hiring theorem provers) simply costs much more than the risk itself upfront.

Yes it does - compound interest if you need periods and tables.

Only if you don’t round to fixed before capitalizing. But when you don’t, numerically less savvy investors (99.9% of people) would just ask to fix it and stop being so smart. They want deterministic output for any particular end of period.

I see you’re coming from academic side, but real world doesn’t work like that. Nobody’s going to take our algorithms and shove 2^(>20) records of sums greater than $100M into them.


Even if you set the data format to 'currency' it still returns false (in Google Sheets). I realise they probably want consistency but it's weird they don't have an option to use a decimal type.


It is my impression of most spreadsheet software that the “data format” is more about display, and not about actually strongly typed representations of the data.


> It's a mystery to me why most commercial software intended for business and financial calculations don't use fixed point decimals

All the reporting software I have seen in banks use decimals for adding numbers.

If you are adding small numbers together, those errors are negligible and get rounded out in the result (you usually can't pay an amount with more than two decimals). It's only a problem if somehow you are doing some calculations that need to be exact on amounts large enough that the float rounding starts affecting your pennies.

Financial reporting has materiality thresholds; no one cares about pennies if the size of a balance sheet is in trillions - the materiality will likely be in millions (not least because the numbers will be shown in millions in the financial report, and the numbers won't be additive because of rounding) - and for a BI tool a number with 12 digits is unreadable and too much information to be useful.

If you are doing pricing, also no one really cares about pennies on a 1 million payment.

> PS. If you design software that works with money amounts, always use fixed point decimals. Don't use floats, it's just wrong!

Well, it depends. If all you do is add and subtract numbers, ok, and that's what they typically do. If you need to do any other calculation (and most financial software does), this will bite you, as percentages and ratios will be rounded aggressively, and multiplying large amounts will overflow.


In Scheme, (= (+ (/ 2 10) (/ 1 10)) (/ 3 10)) is #t.


The problem is that fixed decimal types are far less standard in many programming languages. One thing I really loved about Groovy is that BigDecimal is the standard type for decimal numbers. Type `1.5` and you will have a BigDecimal rather than a float in your hands.

Not very suitable for complex scientific calculations of course, but perfect for web development which is more likely to deal with money than scientific calculations.


> PS. If you design software that works with money amounts, always use fixed point decimals. Don't use floats, it's just wrong!

The fundamental reason is that decimal fractions (negative powers of 10) are not always representable as finite sums of negative powers of 2. Specifically, 0.1 = 0b0.000110011001100110011... is an infinitely long (repeating) binary string. Truncating it to any finite bit width like 32 or 64 bits always introduces an error.
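
You can see the truncation directly by asking Python (or any language that exposes the bits) for the exact value that the double actually stores:

    from decimal import Decimal

    print(Decimal(0.1))    # 0.1000000000000000055511151231257827021181583404541015625
    print((0.1).hex())     # 0x1.999999999999ap-4 - the repeating pattern, cut off and rounded up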


Because perf was super important for a spreadsheet in the 90s, and now back compatibility is super important.

You do not want your numbers changing in the next version of Excel.


I don't think he meant Excel. You can't use decimals in Excel. First you will overflow immediately (like multiplying two numbers in the trillions). Then your rounding will kill small amounts (like percentages). There is no alternative to floats in Excel.


The predecessor to Excel (Multiplan) used binary-coded decimal.

But there's nothing stopping you from doing arbitrary precision decimal in Excel except back compatibility (and all the thousands of lines of code looking for a float).


I learned about binary coded decimal in school and it was the weirdest thing but is pretty good for money.


Doing mostly BCD math is unfortunately one of the reasons why the old-style HP calculators were dog-slow. It also did not help that those old calculators liked rounding values too much.


Not calculators, but I did some tests a few years ago, and I'm pretty sure the BCD instructions on modern processors are no longer implemented in hardware. They were no different than using an equivalent string of other opcodes.


kalker.xyz seems to return true


> If you design software that works with money amounts, always use fixed point decimals. Don't use floats, it's just wrong!

I find it funny the gap between what computer people think finance requires and actual practice.

The tax people in the US generally aren't interested in pennies any more! And when you use tax software that throws away the pennies before the final results, then your sums may very well not match the forms whose information is independently reported to the IRS, by well over a dollar. But nobody cares!


No need to invent a divide between "computer people" and "tax people," whoever they are. Maybe the IRS allows small errors because bugs attributable to floating-point precision are too hard to fix.


AFAIK the IRS has always been accepting of truncating / rounding values to full dollars and discarding cents. It's certainly been that way for the decades that I've been doing taxes.


In Australia the tax office doesn't even let you enter cents in online fields.


My point was that (at least some) standard US tax software rounds to dollars on intermediate results, apparently permitted and required by the IRS.

...while financial institutions send reports to clients and the IRS that have the totals rounded only at the end, meaning the things that are supposed to match don't and can't.

Even though they are both in dollars, they are not consistent even to the nearest dollar.


PS. If you design software that works with money amounts, always use fixed point decimals. Don't use floats, it's just wrong!

Eh, doubles are fine for (most) currencies. Just don't do comparisons without appropriate epsilons.

People who compare floating-point numbers for equality are going to make other fundamental mistakes with whatever data type you force them to work with.


I thought that until literally two days ago. Turns out that if you sum a bunch of 0.37s (not even huge numbers, just around a few thousand of them) you end up with differences on the order of 10-20. Both in MySQL and in Java. No, this doesn't make sense to me either - the differences should be a LOT farther out than 3 digits. And yet.

You should have seen my face when debugging this.


I'm curious about this. Could you provide an example?

Here's what I'm seeing:

---

JavaScript

    (function() {
      let inc = 0.37;
      let times = 5000;
      let total = 0;
      for (let i = 0; i < times; i++) {
        total += inc;
      }
      console.log('expected:', inc * times);
      console.log('actual:', total);
    })();
expected: 1850

actual: 1849.9999999997679

---

Java

    public class MyClass {
        public static void main(String args[]) {
            double inc = 0.37;
            double times = 5000;
            double total = 0;
            for (double i = 0; i < 5000; i++) {
                total += inc;
            }
            System.out.println("expected: " + (inc * times));
            System.out.println("actual: " + total);
        }
    }
expected: 1850.0

actual: 1849.9999999997679

---

update: Changing `double` to `float` in Java yields:

expected: 1850.0

actual: 1849.9778

and maybe that lines up with what you meant by "the differences should be a LOT farther than 3 digits", though it's hard to tell what you mean by "differences on the order of 10-20".


same in SQL Server

   declare @i float = 0.37, @times int = 0, @total float = 0

   while @times < 5000
   begin
       set @total += @i
       set @times += 1
   end

   select @total
1849.99999999977

or

   select sum(t) from(
   select top 5000 convert(float, 0.37) t 
   from sysobjects a cross join sysobjects b) z
1849.99999999977


Something like, in MySQL: create table t ( value double(10,2) not null ); put in a bunch of values, something like 15k with maybe half of them being 0.37.

Do the sum two ways:

- export them in excel and do a sum

- select sum(value) from t

and you get two different values, with a difference of something like 15.

Replace double(10,2) with decimal(10,2), and the sum(value) works correctly.

I still have no idea why this happens. I could expect a tiny difference, and code like yours above will indeed give differences like this, well under 1.0, but my particular scenario had differences many orders of magnitude higher. Still can't explain it.


something like 15k with maybe half of them being 0.37

Oh, that's something entirely different. If some of your other numbers are much larger than 0.37, you will run into a different kind of problem.

You know about exponent and mantissa? Adding a value with a small exponent to a value with a big exponent will cause imprecisions. In extreme cases the result is just the bigger number.
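
A quick illustration in Python (any IEEE 754 double behaves the same way):

    big = 1e16
    print(big + 0.37 == big)    # True: 0.37 is smaller than the spacing between doubles near 1e16
    print(big + 0.37)           # 1e+16 - the small addend is lost entirely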


Somebody is wrong on the Internet today. Very wrong.


If I want to know the monthly payment on a $500,000 mortgage for 30 years at 6%, should I use:

1. Decimal math

2. Binary floating point math

3. Domain knowledge trumps all; it makes no difference


Oh hey I work in mortgage! We use plain old doubles for everything and there's some rounds in there and occasionally an exotic round to the nearest 1/8th. Everything ends up matching fine where it needs to.


Then of course you'll have to work back to compute an APR for that by a crazy iterative formula outlined by the US gov. Which would you use for that process?


irr? The library we use uses doubles https://github.com/eric-malachias/irr. We also use doubles for the amortization table and doubles for everything.


Accounting and banking does not allow epsilons. You always need to be able to account for every last cent on every account and every transaction. And the precision you need is fixed, so there is no purpose in using floating points.

I guess there are some contexts where floats are fine for monetary amounts, for example if you make economic forecasts or simulations. As long as the amounts are not real-world transactions, floats are probably fine.


You always need to be able to account for every last cent on every account

Then that's your epsilon. Or, more reasonably, one or two powers of 10 further down. Absolutely nobody cares about a millionth of a cent. A few people care about a thousandth, though. Most people get antsy if you can't pin a calculation down to the nearest cent.

Floats are pretty much never fine for currency, but doubles are usually OK. There's a world of difference between a 24-bit mantissa and a 53-bit mantissa.


The problem is that some decimal amounts, e.g. 10 cents ($0.10), cannot be represented precisely using binary floating point. It doesn't matter how many bits you use, since 0.1 has an infinite expansion in binary.

This is exactly what the article describes and explains. 10 cent plus 20 cent does not equal 30 cent, when using binary floating point. This is not acceptable in accounting, since at one point the error may accumulate and cause an error at the size of a cent (or more).


Currency quantities are inherently exact. If you're doing epsilon comparisons on currency quantities, you are making a fundamental ontological error.


Nobody gives a hoot about 0.000001 cents. Round to the nearest 1/100 cent after adding or subtracting doubles, and you will be fine in 99.99999% of applications.

The cardinal sin isn't using doubles for currency; it's using them without understanding either the tool or the job that you're asking the tool to perform.


I've been using doubles for currency knowing full well that i probably shouldn't. for 8 years now. so far I've only noticed being off by a cent here and there. yep.. just don't care. will cost more than a penny to fix it now.


You've clearly never worked with accountants.


[Citation needed]


You've clearly never worked with accountants. [1]

[1] https://news.ycombinator.com/item?id=34717487


I don't know what I expected


Depends what you're doing with them. A value at risk calculation gives you a currency-dimensioned result that's hard to represent exactly, for example.


Yes, a more precise statement is "exact currency amounts form a group under addition" - once you depart from group operations (e.g. multiplication by non-integral scalars) you can start to consider floats.


Yes

But actually, since fixed point decimals are not always available, use the smallest unit of account, not the smallest legal tender, and integer arithmetic

And learn how to round


Smallest unit also has issues. For instance the Indonesian rupiah technically is made up of 100 sen, but the currency is so inflated nobody uses it and currency libraries behave differently (even different versions of the same library). We had a bug where different OS versions provided different values when normalizing it to a smallest unit integer.

If you really don’t have access to a decimal type I think the best solution is to convert it to micro-units (price * 10^6). This is what Android does in its billing library.
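
A sketch of that micro-units conversion (Python; `to_micros`/`from_micros` are invented names, and the parsing goes through a decimal string so no binary float is involved):

    from decimal import Decimal

    MICROS_PER_UNIT = 10**6

    def to_micros(price: str) -> int:
        # "19.99" -> 19990000; truncates anything beyond six decimal places
        return int(Decimal(price) * MICROS_PER_UNIT)

    def from_micros(micros: int) -> Decimal:
        return Decimal(micros) / MICROS_PER_UNIT

    print(to_micros("19.99"))      # 19990000
    print(from_micros(19990000))   # 19.99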


Use sen. It can be transferred by bank transfer.

What people use as cash is a distraction, generally


I've said this before, but I wish python and other high level languages defaulted to decimal for the literals, and made float something you had to do explicitly. My reasoning behind this is that floating point math is almost always an implementation detail, instead of what you're actually trying to do. Sure, decimal would be slower, but forcing people to use float as an optimization would remind them to mitigate the risks with rounding or whatever.


There is a very big catch - many if not most mathematical functions won't be exact anyway, so you have to round at some number of digits. Python does this with its `decimal` module: the precision (number of significant digits) is literally part of the global state [1]. While this allows for more concrete control over rounding, even if there were no such control, it turns out that the choice of radix doesn't matter that much.

[1] https://docs.python.org/3/library/decimal.html#context-objec...
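
For example (the precision really is ambient shared state unless you override it with `localcontext`):

    from decimal import Decimal, getcontext

    print(Decimal(1) / Decimal(3))   # 0.3333333333333333333333333333 (28 significant digits by default)
    getcontext().prec = 6
    print(Decimal(1) / Decimal(3))   # 0.333333 - same expression, different answer via global state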


> many if not most mathematical functions won't be exact anyway

That's actually a really interesting question - while this is obviously true for most functions which (in a mathematical sense) exist, I wonder if it's true for "all functions weighted by their use in computing applications"? That is - do boring old "addition, subtraction, and multiplication of integers" outweigh division, trigonometrics, etc.?

In 3D modelling/video games, almost certainly not. In accounting software...probably? Across the whole universe of programs: who could say?


You're missing the other major problem, which is that range is mutually-exclusive with precision. The scientific community discovered a long time ago that exponential notation is the superior way to represent both for very large and very small values because the mantissa is shifted to the place where precision is needed most.

>In 3D modelling/video games, almost certainly not.

A 32-bit integer divided into a 16-bit whole part and a 16-bit fraction would be limited to representing values between -32768 and 32767, while also having worse precision than a 32-bit IEEE 754 floating-point value at values near 0.

>In accounting software...probably?

Representing money in terms of cents instead of dollars removes the need for real-numbers entirely outside of "Office Space" scenarios where tracking fractions of cents over millions of transactions adds up to tangible amounts of money.

>Across the whole universe of programs: who could say?

Most computer programs don't need real numbers of any sort, and the ones that do need to be written by people who understand basic mathematical concepts like precision.


> A 32-bit integer divided into a 16-bit whole and a 16-bit fraction would be limited to only representing values between -32768 and 32767 while also having worse precision than a 32-bit ieee std754 floating-point at values near 0

...OK? I'm not sure how that relates to my supposition that trigonometric operations are likely to be more common in 3D modelling cases. I'm not arguing for or against any particular representation of numbers therein.

> Representing money in terms of cents instead of dollars removes the need for real-numbers entirely

I wasn't imagining dollars-and-cents, but rather rates - X per Y, the most natural way in which division arises in real life.

> Most computer programs don't need real numbers of any sort...

You're again arguing against a case I'm not making. I'm not making any claims about the necessity (or otherwise) of real numbers in programs, but simply wondering about the prevalence of particular operations.

> and the ones that do need to be written by people who understand basic mathematical concepts like precision.

A snide insult motivated by your own misunderstanding of my point. I understand precision, and it's irrelevant to my point.


>A snide insult motivated by your own misunderstanding of my point. I understand precision, and it's irrelevant to my point.

Jeez, talk about arguing against cases I didn't make - I never said you don't understand precision.


I had thought this was also to do with GPU physics - at some level of precision in the float, it matters, so one machine configuration will not precisely match another machine doing the same calculation. Or something like that.


Normally I would say that it is hard to tell, because it is. But I think in this particular case I have a reasonable argument - back in 2014, when Python added support for the matrix multiplication operator `@`, the proposal author did a survey and made a case for it [1]. And you can see that the exponentiation operator `**` is actually used more than division `/` even in non-scientific usages. And as you've guessed, exponentiation won't be exact if its exponent is negative.

[1] https://peps.python.org/pep-0465/#so-is-good-for-matrix-form...


Interesting data, thanks!

Since addition, subtraction, multiplication, and modulus are each used more than division and exponentiation _combined_ (and since not every use of those last two functions would result in an "inexact" result), I think we can pretty clearly conclude that "most usages of mathematical operators in these libraries will result in an 'exact' result" (I'm hand-waving on the definition of "exact", I don't think it's at issue here)

Which is not, of course, a good justification for ceasing to worry about the problem, since a) those packages might not be representative of all libraries, and b) a small proportion of uses might result in a disproportionate amount of bugs.


> a small proportion of uses might result in a disproportionate amount of bugs.

This is what concerns me. Sure, using decimal floating point solves 0.1 + 0.2 = 0.3 (which I can't imagine ever writing in real code). But if you get used to that, then you start to expect 0.1*x + 0.2*x to be 0.3*x, and depending what x is this may or may not be true. Maybe it works for all of your test cases (because your test cases are things like 2 and 10^-4), but then you accept some user input and start getting weird bugs (or infinite loops). There is no good solution besides expecting and preparing for rounding error.


> Maybe it works for all of your test cases, but

This reminded me of an article I read awhile ago, probably here on HN:

https://randomascii.wordpress.com/2014/01/27/theres-only-fou...


I'm trying to find the quote you're talking about, but I just see a comparison between stdlib, scikit-learn, and nipy. And for the import stats it is just what was on GitHub in 2014. I think that it is safe to say that most code is not publicly available on GitHub.

Though regardless of usage, I think that people doing stuff that needs floats are more likely to understand why they need them, and have the ability to use them explicitly without much issue. By using python, and most other high level languages, we're already making sacrifices to make things easier to use and understand, and in Python specifically we're told that explicit is better than implicit, except for this.


> exponentiation won't be exact if its exponent is negative

Sure, but how common is that? The exponent is almost always positive when I've seen it.


Rounding is fine but it would be nice if 0.1+0.2 was predictably 0.3. I am having a lot of trouble explaining to people that float numbers should be avoided unless you really need them. I have seen code that stored versions as floats and the dev was surprised that version 1.1 wasn't always equal to "1.1".


> I have seen code that stored versions as floats and the dev was surprised that version 1.1 wasn't always equal to "1.1".

And will break when the version reaches 1.10. While I agree we need a better way to teach this (e.g. inexact-exact distinction as in Scheme or more recently Pyret), that's as problematic as storing a telephone number as an integer (or worse, a FP number).
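
The 1.10 failure mode is easy to demonstrate; this is about parsing versions as numbers at all, not about binary vs decimal:

    print(float("1.1") == float("1.10"))   # True: "1.1" and "1.10" parse to the same number
    print(float("1.9") < float("1.10"))    # False: as numbers, version 1.10 sorts before 1.9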


Totally agree that storing a version in a float is stupid but that's where we are :-(


Wouldn't seriously suggest doing this, but rationals with big integers would have exact results for all the common operations


I mean, cosine is pretty common...

The next level solution is to apply generators so that either the decimal stream or the continued fraction is allowed to be infinitely precise, but I think this can have dangerous effects where checking whether a number is equal to 0 or maybe 1 can involve infinite computation? So that's where you really understand “oh, I do really need that epsilon, for comparisons’ sake.”

For continued fractions I think you can also just have your library bound the size of the integers involved? So “it’s an array of signed int32s, but if your continued fraction generates a number that would overflow that, we just truncate the stream at that point.” Then the library is able to say that these two things are equal because their difference is [0; int_overflow] which becomes just [0]. Something like that.


just yesterday someone commented about https://fredrikj.net/calcium/index.html

which is pretty darn amazing. pi and e are essentially first class, but a lot of transcendentals aren't. seems like a really neat approach.


Calcium is amazing and so is exact real arithmetic or constructive real number, but they all can't avoid practically undecidable inputs. (Algebraic numbers as in Calcium can be made decidable, but they still can take an unreasonable amount of time to compute. Calcium does answer "unknown" for those cases.)


I started out using the Haskell “Rational” type [1] (which is exactly what you mention) for https://cryptomarketdepth.com/ but I had to abandon it because it was horribly slow. I was multiplying numbers with roughly 8 decimal places, and once I had done this like 100 times my program spent almost all its time trying to simplify fractions with a 1000 digit numerator and denominator.

[1] https://www.stackage.org/haddock/lts-20.10/base-4.16.4.0/Pre...


This is indeed the reason that Python didn't (initially) have rational numbers while its spiritual predecessor ABC had. [1]

[1] https://python-history.blogspot.com/2009/03/problem-with-int...


Very interesting. Thank you for sharing this.


Yup, it's a terrible idea, even if theoretically possible


And that's fine! People directly deal with math in decimal context, so we already have some expectations about how rounding etc works. So long as decimal type and its operations follow those expectations, they'll cope with it. The problem with binary is that these expectations don't translate for some of the most basic stuff.


Honestly, I'd just be happy with first class language support for decimals at all.

For example, I'm a huge fan of TypeScript, but it is hamstrung by the fact that javascript only supports a single `number` type (and, recently, `bigint`). Worse is the effect that since JSON is derived from javascript, it also has no built-in decimal type. So what happens inevitably when you want to represent stuff like money:

1. First people start using plain numbers, then they eventually hit the issues like this post.

2. Then they have to decide how they will represent decimals in things like APIs. Decimal string? Integers that represent pennies or some fraction of pennies?

3. Also, pretty much all databases support decimals natively, so then you get into this weird mash of how to not lose precision when transferring data to and from the DB.

Overall it's just definitely one of those issues that programmers hit and rediscover again and again and again. I'm surprised there hasn't been more movement towards a better language-level solution for the most popular language in use worldwide.



Thanks, it's been forever since I've used C# so glad to know this exists, and seems like the ideal implementation.

Really wish JS had added first class support for a bigdecimal class before bigint. After all, the former is basically a superset of the latter.


Came here to make this exact same comment, including the "I love TS but wish it wasn't built on JS" sentiment. I explored Rust recently and was disappointed to see there's no stdlib Decimal, but instead there are multiple community implementations - so I'd have to sort through and vet the right one.


It's surprising that none of the popular high level languages that borrow so much else from Lisp have borrowed its rational number type. Really the whole numeric tower makes a ton of sense, and you can always declare floats if needed.


Thanks for the encouragement to look up lisp's numeric tower (https://en.wikipedia.org/wiki/Numerical_tower), that was interesting to compare to the languages I'm more familiar with.


What about Julia? It's somewhat popular and heavily inspired by Lisp. It has a type tree that's reminiscent of Lisp's numerical tower: https://global.discourse-cdn.com/business5/uploads/julialang...


Rationals are not really great for a long series of calculations. E.g. imagine taking the average of a bunch of numbers like 1.15542345089. It will blow up your CPU usage to stratospheric levels.


Including Lisp. Not every Lisp dialect has rationals.


Unless you're dealing with billions and care about pennies, using float for money is fine in 99% of cases.


That's absolutely, 100% false. Trivial example (I've actually hit an analogous bug in production): User has money in their wallet, and you want to check before they make withdrawals that their balance doesn't go negative. You have some logic in your code that is basically like:

    if (walletBalance - sumOfWithdrawals < 0) { throw new Error('overdrawn'); }
Try that code in Javascript where walletBalance = 0.3 and the sumOfWithdrawals = 0.2 + 0.1.

Point being there are tons of operations in the financial world where you check things against 0, or want to ensure that a breakdown of smaller transactions equals a larger amount. Those all can fail with floating points but succeed with decimals.


That's only true if you know that you have to round everything to the second digit. If you have a user calculating the total cost of buying something that is $7.10 and something that is $10.20, you don't want to show them $17.299999999999997. You would be better off storing everything as an integer of pennies and just displaying the dot in the frontend. To be fair, I also think that high level languages should come with types for all the major currencies.


I'm an expert programmer and I agree with downvoted IshKebab. This is just HN having a Reddit moment.

IEEE 64 bit floats are accurate to just past 15 decimal digits. For ordinary monetary amounts, the exact figure in cents is approximated with ridiculous precision. If you rub two pennies together, you are likely causing more of a difference in the amount of copper than the IEEE 64 bit float causes in the value.

You have to do many, many additions and multiplications before you get a result which has accumulated so much error that it is now closer to the wrong penny. E.g. if you don't deal with dollar amounts more than 7 figures, you have about 8 places past the decimal point; you need something like a 6 place error before the penny is affected.

You can counteract this problem by correcting intermediate results to the floating-point value which is closest to the exact penny result. In other words, throughout your calculation, you truncate away the difference between the result and the best approximation of the dollar-and-cent value.

Within this framework, you can implement all required rounding rules, too. You can take a floating-point result representing a fraction of a penny and round it according to banker's rule to the penny.

Of course, you can't just ignore the issue and just blindly use floating-point for money in a serious accounting system; but that's a strawman version of using floating-point for money.
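
A sketch of that snap-to-the-nearest-penny correction (Python; note that Python's round() resolves ties to even, i.e. banker's rounding):

    def snap_to_cents(x: float) -> float:
        # replace an intermediate result by the double closest to an exact cent amount
        return round(x * 100) / 100

    total = 0.0
    for _ in range(5000):
        total = snap_to_cents(total + 0.37)
    print(total)               # 1850.0 - the drift shown elsewhere in this thread is gone

    # One classic gotcha: the double nearest 2.675 is slightly below the true halfway point,
    # so rounding it to two places gives 2.67, not 2.68.
    print(round(2.675, 2))     # 2.67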

Also, Microsoft Excel uses floating point. See here:

https://learn.microsoft.com/en-us/office/troubleshoot/excel/...

Vast armies of people rely on Excel for financial calculations.


Since you're an expert, I'm sure you won't mind these corrections:

> IEEE 64 bit floats

That should be IEEE 64-bit floating-point. The name `float` is a 32-bit floating-point.

> are accurate to just past 15 decimal digits.

FOR SOME RANGE. They're called floating-point because they can modify how much scaling is devoted to either side of the decimal point. The more scale it has to devote to representing the whole portion, the lower its precision in the fractional portion. At some point, a 64-bit floating-point value cannot represent any digits after the decimal point. Past around 2^53, a `double` 64-bit floating-point value cannot even represent every whole number, because at that point the spacing between representable values is 2 or more.


A "float" is 32 bit floating-point in the C language, yes.

If I say "64 bit float", it's obviously not that one.

Here is CLISP:

  [1]> (type-of 3.0)
  SINGLE-FLOAT
  [2]> (type-of 3.0d0)
  DOUBLE-FLOAT
Python3:

  >>> type(3.0)
  <class 'float'>
  >>> type(3.0e300)
  <class 'float'>
Looks like it's calling the 64 bit ones just float.

Rust has f32 and f64. Some historic languages have used type names like REAL and DOUBLE and others.

"IEEE 64 bit float" is almost unambiguous; it could be the binary one or the decimal one.

IEEE uses identifiers like binary64, decimal32, if you want to be pedantic, and not float and double.

> FOR SOME RANGE ...

Your comment seems to be based on the wrong idea of what the number of digits means. An IEEE 64 bit binary float stores 15 significant digits without loss. In the C language that gives us the float type, the preprocessor constant DBL_DIG has a value of 15 (on IEEE floating point platforms).

This is a 15 digit number using E notation (in terms of digits of precision / significant figures):

  1.23456789012345E-20
So is this; it is not a 16 digit number:

  123456789012345.0
And so is this; it isn't a 17 digit one:

  12345678901234500.0 (same as 1.23456789012345E16)
You have to write the number in exponential notation, and chop the trailing zeros. Then count the remaining number of digits in the mantissa part.

I think it's only for subnormal numbers that the 15 rule breaks down; but I might have mentioned that. These are special representations close to zero beyond what is reachable with the regular exponent and mantissa, which have the benefit of certain desirable behaviors in underflow situations.
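
The 15-digit claim is easy to check by round-tripping (Python, same IEEE 754 binary64 underneath):

    s = "1.23456789012345e-20"            # 15 significant digits
    print(format(float(s), ".14e"))       # 1.23456789012345e-20 - all 15 digits survive the double
    print(format(float("0.1"), ".16e"))   # 1.0000000000000001e-01 - ask for 17 and the error shows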


I don't think experts like pointless and incorrect pedantry anymore than anyone else. Well maybe a little more because they can easily put you in your place. :)


> You have to do many, many additions and multiplications before you get a result which has accumulated so much error that it is now closer to the wrong penny

> You can counteract this problem by correcting intermediate results to that floating-point value which is closest to exact penny result

The argument seems to be "floats are okay, as long as you're careful", but forgetting to round the number in between a large number of operations is a probable mistake.

Using decimals would make such a mistake impossible.


Even with a decimal representation, there are situations where you have to remember to round to a cent.

Decimals are software libs; what could go wrong?

More people use the floating-point instructions of a popular CPU than any given decimal library.

If you're starting from scratch, it's probably a lot less work to write (test and debug) a Money class based on floating-point, whose overloaded math operations do the required rounding (so that code using the class cannot forget) than to make a Money class based on decimal arithmetic.

(The last time I wrote an accounting system, I made a Money class based on integers. It could be retargeted to other representations easily. I could make the change and compare the ledger totals and other reports to see if there is a difference.)


> Decimals are software libs; what could go wrong?

Bugs in libraries do exist, but it's much easier to fix a bug in one place, than to track down every single line where floating point operations could misbehave.


> More people use the floating-point instructions of a popular CPU than any given decimal library.

Perhaps, but how many of the former are in a position to notice accuracy issues? I have more faith in a reputable decimal library than your average FPU, frankly.


You can use floating-point for money even if you're dealing with (American) billions (10 figures), and care about pennies. With 10 figures in the integer part, you have 5 more digits of precision in the fractional part, so down to the thousandths of a cent. A single addition or multiplication will not accumulate an error which affects the cent, and you can round the calculation to the best approximation of the penny in order to clip off the error.


JSON specifies the grammar of numbers as tokens, but not the behavior of how they should be parsed. Implementors could choose to parse numbers as decimals without violating the spec.


I agree with you, but there's theory and then there is reality. The JSON spec is famously "underspecified" in that it pretty much ONLY specifies the token grammar but nothing with respect to interpretation, and hence there are lots of areas which have been problematic for years - the spec even says this with respect to object keys:

> The JSON syntax does not impose any restrictions on the strings used as names, does not require that name strings be unique, and does not assign any significance to the ordering of name/value pairs. These are all semantic considerations that may be defined by JSON processors or in specifications defining specific uses of JSON for data interchange.

So, in reality, the JSON "spec" is really how the most popular implementations interpret it. I'm not aware of a single implementation (though I could most definitely be wrong) that interprets number tokens as anything but floats/doubles by default.


In practice, it is usually easier to use the parser that came with your language(s) and figure out a different way to encode the values that are causing trouble, instead of writing a new parser.


And then you get “why isn’t 3 × ⅓ equal to 1?” and similar questions. “Use rationals” would only postpone the issue to “Why isn’t (√2)² equal to 2?” and similar questions.

I would think that, nowadays, every child learns almost in kindergarten that calculators do not always produce exact answers.

Also (nitpick), it’s not “float vs decimal”. “Floating vs fixed point” and “binary vs decimal” are orthogonal issues.


Yes, children do learn that. But on the calculator, they're inputting numbers in decimal, and it's decimal internally. In programming, we input numbers in decimal, and even write them that way in source code, but the actual math is all binary - thus, there's a disconnect between the common sense expectation of what (0.1 + 0.2) ought to do, and what it actually does. Someone coming from a calculator would not expect that to be unequal to 0.3, unlike the situation with square roots.


> I would think that, nowadays, every child learns almost in kindergarten that calculators do not always produce exact answers.

such a strange comment to make. the number of people that would ever bump into this situation is so small. like the difference of .1 + .2 = .3 and .30000000000000004

i just used my iPhone to do (√2)² and it displayed 2 as the result. same for 3 x 1/3 to receive an answer of 1. i can only assume that the default android calculator app would behave the same. between those 2 apps, we've probably covered the default calculator for the majority of people.

gotta break out of the HN is the world shell, and realize the majority of people do not suffer the same issues you might deal with on a daily basis.


This smells like a good fit for Haskell, since computation is deferred until a result is demanded. I haven't tried it but I can imagine an implementation of, for example, division that would do its best to keep the numerator and denominator intact in their original formats until forced to kick out a value.

(My Haskell-fu isn't deep, but I suspect it would even be possible to write it so that, for example, multiplication of two division operation expressions multiplied the numerators together instead of doing divide -> divide -> multiply...).


There's Data.Ratio which represents fractions by their numerator and denominator in lowest terms:

    $ ghci
    GHCi, version 8.10.7: https://www.haskell.org/ghc/  :? for help
    Prelude> :m +Data.Ratio
    Prelude Data.Ratio> :t (%)
    (%) :: Integral a => a -> a -> Ratio a
    Prelude Data.Ratio> 18 % 21
    6 % 7
    Prelude Data.Ratio> 1%10 + 2%10
    3 % 10
There's even Data.CReal for working with the computable reals:

    Prelude> :m +Data.CReal Data.Complex
    Prelude Data.CReal Data.Complex> let i = 0 :+ 1
    Prelude Data.CReal Data.Complex> exp (i * pi) + 1 :: Complex (CReal 0)
    0 :+ 0


You could overload operators like + and / so they evaluate to an AST representing the calculation rather than the actual calculated value.

Then you need to do some analysis on your AST to try and restructure it in a way that preserves as much precision as possible.

You don’t really need Haskell for this though, in theory it will work in any language (but more ergonomically if you have operator overloading).
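
A minimal sketch of that in Python (the Expr/Num class names are invented for this example): + and / build a small expression tree, and evaluation, done here with exact Fractions, is deferred until a value is actually demanded.

    from fractions import Fraction

    class Expr:
        def __init__(self, op, left, right):
            self.op, self.left, self.right = op, left, right
        def __add__(self, other):
            return Expr('+', self, other)
        def __truediv__(self, other):
            return Expr('/', self, other)
        def value(self):
            # Only evaluate when a result is demanded, using exact rationals.
            l, r = self.left.value(), self.right.value()
            return l + r if self.op == '+' else l / r

    class Num(Expr):
        def __init__(self, n):
            self.n = n
        def value(self):
            return Fraction(self.n)

    e = Num(1) / Num(10) + Num(2) / Num(10)
    print(e.value())    # 3/10, exactly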


Raku interprets decimal literals (like 0.1) as limited-precision rational numbers (Rats) [0-1].

I think this is a pretty user-friendly compromise.

[0] https://docs.raku.org/syntax/Number%20literals

[1] https://docs.raku.org/type/Rat


Actually, the limited precision is the default (caused by the numerator as well as the denominator having to fit in a 64-bit integer).

By default, when it no longer fits, it will convert to floating point (called Num in Raku). But this behaviour can be configured: another alternative is for the numerator and denominator to both be big integers. This gives you infinite-precision rational numbers, at the expense of potentially needing unbounded CPU and/or RAM.


Common Lisp defaults to ratios of integers for all precise calculations, which is nice other than ending up with results like 103571/20347, which is not obviously "slightly more than 5" the way that 5.090234432594485 is. It does have the advantage over decimals that e.g 1/3 can be represented precisely.


I like rational numbers in general, but they do have some huge practical issues in numerical algorithms. In particular, there's no upper bound on the memory use of a rational based on its magnitude. Following from that, there's no lower bound on the time an arithmetic operation may take based on the magnitudes of the operands. When you're doing hundreds of thousands of operations on an accumulator, this can go very wrong.

So I caution against blind preference for rational representations as well. You really have to choose your numeric representation based on your use case. It's unfortunate that this can be so hard to control precisely in many programming languages.
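
A rough illustration in Python: iterating a simple quadratic update with exact Fractions makes the denominator, and hence the cost of every later operation, roughly square at each step.

    from fractions import Fraction

    x = Fraction(1, 3)
    for _ in range(5):
        x = x * (1 - x)         # kept exact, no rounding anywhere
        print(x.denominator)    # 9, 81, 6561, 43046721, 1853020188851841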


Yup; all representations of numbers have tradeoffs. Fixed-size integers, log-scaled numbers, and floats all have finite precision. Everything else requires variable space and/or time.


This put my daughter off of programming. When she was 7, I showed her how to use python in immediate mode, and she got it without difficulty. She even understood variables. Then one day she wanted to add prices, and she got one of these errors, and she never wanted anything to do with it again.


I still remember writing something in high school along the lines of:

  i=0  
  while(i<1):  
      <something with i>  
      i=i+.1
And spending hours trying to figure out why it ran an extra iteration, and this was early enough that it wasn't easily googleable. Whatever I was doing with i needed it to be .1, .2, .3..., and I thought I was being clever not doing 1...10 and dividing by 10 every iteration within the loop. I think there was also a quirk in whatever language I was using where print(i) rounded to a handful of decimal places, so it looked fine while debugging.

Very frustrating, but in retrospect very eye opening.
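
For what it's worth, the equivalent loop still does this today; in Python, for instance:

    i, count = 0.0, 0
    while i < 1:
        count += 1
        i += 0.1
    print(count)    # 11 -- after ten additions i is 0.9999999999999999, still below 1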


Okay, I'll defend floating point numbers. The choice of floating point over decimal represents the choice of science over money. In science, it's more important to have a number system that represents everything from the infinitely small to the infinitely large, rather than one that has perfect precision. Because in nature, perfect precision does not exist. It doesn't matter what pi is to perfect accuracy because there are no perfectly round circles in reality. Only in money and mathematics do people really care about perfect precision. In the real world, precision is negotiable.

I think that's a good lesson for kids.


Yes, but actually no. Precision becomes important once you start digging in. Calculate the GPS time dilation without sufficient precision and you'll be in trouble. Go down to quantum physics to discover that the exact ratio of mass between the electron and proton might matter for your nuclear reactor.


You don't even need to get that specific, even web developers encounter this sooner or later, often as a UI bug in what should be really simple math. Then one day you wonder "what the fuck are all these zeroes? Oooooh..."

That's how I learned about it years ago.


It's understandable - you trust a tool like a calculator to give you the right answer. If it sometimes makes mistakes and you have to check each answer by hand, it isn't really saving you any time.

To many, a rounding error makes the answer "wrong", and suddenly the tool has switched from a reliable one into an untrustworthy one.


> you trust a tool like a calculator to give you the right answer.

By middle school, kids should have learned that you can't trust calculators. There are all sorts of numbers like pi, e, sqrt(2) that are impossible to represent. Once you start getting into trig, you have to accept rounding.


Sure, but .1 is an ordinary, exactly-written decimal, so they can be excused for finding it a little unreasonable that .1+.1+.1+.1+.1+.1+.1+.1+.1+.1 doesn't equal 1 in many languages.

Explaining _why_ .1 isn't representable in binary floating point requires explaining IEEE-754, and explaining _that_ requires an understanding of binary numeric representation.

I teach college students who find this confusing, so I think it's fair that the average person finds floating point behavior confusing (in fact, I've had to explain to physics professors doing computational simulation work why their 1-<tiny number> isn't working out the way they expect -- though they initially tried using double doubles to get around the problem).


This does depend a bit on the calculator. embedded_hiker's anecdote has made me update in the direction of exposing my daughter to Wolfram Alpha before Python...


My AP calc teacher was an expert on writing tests that would trigger calculators into approximation mode. Pretty much every homework problem, the calculator could easily do an exact answer. But you better have learned, because come test time, the best your calculator is going to offer is 0.942858934759084...


> I wish python and other high level languages defaulted to decimal for the literals, and made float something you had to do explicitly.

When Python originally made the choice to have literals with decimal points in them be floats, the language did not have a decimal implementation, so floats were the only choice.

I don't know if anyone has proposed changing the default now that Python does have a decimal implementation, but I suspect that such a proposal would be rejected by the Python developers as breaking too much existing code.

What would be almost as nice, and would be backwards compatible, would be introducing a more compact way to declare decimal literals, something like "0.1d".
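
For comparison, a hypothetical "0.1d" literal would presumably just be sugar for what the decimal module already does explicitly today:

    from decimal import Decimal

    print(0.1 + 0.2 == 0.3)                                     # False (binary floats)
    print(Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))    # True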


You are conflating fixed point vs floating point and decimal vs binary, which are entirely different things. You can have decimal floating point numbers and binary fixed point.

You realize that decimal numbers also have the exact same types of rounding issues as binary numbers right? The only difference is that the former allows you to divide by both 2 and 5 cleanly, whereas the latter only lets you divide by powers of 2. If you want to divide by 3, 7, 11, or compute a square root or an exponent, using decimals is not going to save you from having to reason about rounding.
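
For example, Python's decimal module (at its default 28-digit context) hits the same wall as soon as the denominator has a prime factor other than 2 or 5:

    from decimal import Decimal

    third = Decimal(1) / Decimal(3)
    print(third)             # 0.3333333333333333333333333333 (rounded at 28 digits)
    print(third * 3 == 1)    # False, for the same reason 0.1 + 0.2 != 0.3 in binary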


Would a "fixed-point binary-coded decimal" type be a solution here? With 64 bit values that gives you 16 digits to play with, which for "everyday numbers" seems like plenty.


IEEE double gives you 15 digits and a much larger dynamic range, so the tradeoff is just not worth it except for specialized applications.


I just had a horrible idea. Decimal is commonly used with fixed point (no speed penalty), whereas binary is commonly used with floating point.

But what if... what if they had a bastard child? What if we moved the point a fixed distance in decimal... and also a floating distance in binary?

The value represented would then be sign * mantissa * 2^exponent * 10^bias

With a bias of -6, you could represent every multiple of 0.000001 up to 9 billion if I did the math correctly.


> Sure, decimal would be slower.

Would it? I thought dealing with integers - a value, in binary - would be faster than floats - a value in binary, a decimal places value, whatever odd logic there is required to hide the leaky abstraction.

Edit: never mind. Since the conversation was 'decimal versus float' I thought 'decimal' meant plain integers with no fractional part.

If 'decimal' means a type with a decimal point, I think a better suggestion would be to use integers.


In terms of speed: int > float > decimal. Depends on the hardware type. On GPUs, float > int. However the performance difference is negligible for many, many use cases so I generally use decimal as the default and only use float if absolutely necessary.


Once you are doing microservices, and serving any customer call requires 3 HTTP requests and converting everything into JSON and back every time, it doesn't matter whether you use float or int, CPU, GPU, a microcontroller, or even an abacus by hand.


Err, it really depends on what you're doing. Integer division and modulo are still not fast.


Define "fast" please.


It really depends! Floats are /really/ fast for many operations that are really slow on ints.

On modern CPUs, it's faster to cast a number to double, do a square root and cast back to int, than even the cleverest bithacking int algorithm.


> a value in binary, a decimal places value, whatever odd logic there is required to hide the leaky abstraction.

Ironically, this is a much better description of `decimal` than of `float`.

IEEE float math is done in hardware. It is "one value in binary" that is added, subtracted, multiplied, etc etc with electric circuits.

The decimal abstraction requires manually keeping track of the number of significant digits, converting back and forth so that two different decimals can be added / multiplied, etc etc. There's a lot more that has to happen besides asking the CPU to do a single operation.


I do hope we will get a hardware implementation of decimal, now that chipmakers don't know what to do with the extra transistors and keep coming up with new vector instructions that most developers don't know how to use and most languages don't even support.


Which decimal? Fixed point, or floating point? Should the numerator and denominator be given the same bit width? How do you deal with overflow and underflow?

Its easy to think of "decimal" as one thing because every language provides a library called `decimal`, but there are a million subtle decisions and tradeoffs to make when choosing one standard binary representation. Most languages don't have a binary representation at all, and implement `decimal` as a high level abstraction with regular integers.


We managed to standardize on a single binary floating point representation in practice, and even if it's not perfect, the benefit from such standardization makes it worthwhile.


Pick a solution and make a decision, just like it was done with every other format. IEEE has defined one; I believe it was mentioned above.


a) You want rationals, not "decimals". Limiting yourself to denominators of powers of 10 is utterly stupid if you have the chance to implement a proper number type stack.

b) Floats are efficient approximations of real numbers. Trigonometry and logarithms are vastly more important than having the numbers be printed pretty, so defaulting to rationals instead of reals is quite insane.


> Trigonometry and logarithms are vastly more important than having the numbers be printed pretty

Citation needed. I suspect more people are doing accountancy than physics simulations.


I would love that.

    f = float(closest_to=0.1)
You can't really mess up programmer expectations like this.


fixed point numbers!


You can already use fixed-point (AKA "decimal") values in any language which supports integer arithmetic, but you will quickly discover the two major limitations it has: your programs still need to account for precision, and the range of values which can be expressed becomes smaller as precision increases.


Do processors accelerate decimal/fixed point? I know some have offered this in the past but I’m not current on instruction sets for accelerated maths. My guess is a lot more energy goes into floating point and integer.


Not as far as I know. That would require everyone to agree on one binary representation, which hasn't happened. There are tons of different fixed-point implementations out there, each with different tradeoffs. Choosing one implementation and getting all languages to implement it (so that CPU makers would bother accelerating it) would be a herculean task, IMO.


IEEE 754 actually does specify a decimal floating point format since 2008, but I don’t think it’s widely implemented.


Do you have any more info on this? The text of IEEE 754 costs $100 and I can't find any reference to it on Google


IEEE 754-2008 combined the IEEE 754 with the IEEE 854 decimal float standard, so hence any post-2008 IEEE 754 version also contains decimal float (and IEEE 854 has been withdrawn).

But like the parent poster noted, hardware tends to not actually implement the decimal float parts (I mean, IEEE 754 doesn't care about how calculations are made or how fast they are, so a software emulation is perfectly acceptable from the standard perspective). I think IBM POWER has one of the rare HW implementations of decimal floats.


Thank you! With the hint of IEEE 854 I was able to find https://en.wikipedia.org/wiki/Decimal64_floating-point_forma...


I'm amazed you didn't write: "The text of IEEE 754 costs $100.00000000000003 and I can't find any reference to it on Google."

:)


He tried but he threw a NaN


Some IBM POWER and mainframe microarchitectures have hardware support for decimal floats.

Acceleration for binary fixed precision was (still is I guess) common in DSPs. Not decimal fixed though.

I lost track of which extensions Intel provides, but I wouldn't be surprised if something was available.


Intel processors have instructions for performing decimal addition and subtraction, with 2 decimal digits per byte, but they are only available in 32-bit mode, and were dropped in 64-bit mode, which I guess answers the question of whether anyone ever actually used them.

https://docs.oracle.com/cd/E19120-01/open.solaris/817-5477/e...


fixed point arithmetic is just integer arithmetic

(with a multiplication to enter and a division to exit)
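
A sketch of that in Python, with a scale of 100 (i.e. working in cents) chosen purely as an example:

    SCALE = 100                   # enter fixed point: multiply by the scale and round
    a = round(2.03 * SCALE)       # 203
    b = round(0.03 * SCALE)       # 3
    c = round(2.00 * SCALE)       # 200
    print(a - b - c == 0)         # True -- plain integer arithmetic in between
    print((a - b - c) / SCALE)    # 0.0 -- exit fixed point: divide by the scale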


I don't think they do, at least on x86-64 and arm



Is that why banks are known for using it?


More likely it's the banks that were computerizing early on. If you look at PLs that were available in late 60s / early 70s, COBOL is the one that's most optimized for CRUD and reports, which is largely what the banks wanted. And then once you already have it and it works, why change?


> [...] defaulted to decimal for the literals, [...]

Why not rational numbers?


Let's say you and two friends are buying a group ticket to some show. The ticket costs $100. Now try to find a way to pay exactly 1/3 each.

Money transfers tend to involve payments with fixed precisions. For cash, it's often rounded to the smallest available coin, for credit cards, you can add a few more digits, but it will still be fixed point.

When you settle a transaction, you typically need to pay exactly the amount defined by the transaction because some system is doing some check that the amount is exactly right.

So in my example above, if your friends are paying $33.33 you may have to pay $33.34 to make the transaction go through.

When you're writing systems involved in actual money transfers (or accounting) you tend to be better off using fixed point decimal, with predefined precision for each variable involved.

Within a calculation, it may be fine to use float, but the number going into ledgers or transactions need to have the predefined precision.
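
A minimal sketch of the usual way to make such a split add up (the helper name is invented here): hand out the rounded-down share and spread the leftover cents.

    def split_cents(total_cents, n):
        base, remainder = divmod(total_cents, n)
        # The first `remainder` payers carry one extra cent each.
        return [base + 1] * remainder + [base] * (n - remainder)

    print(split_cents(10000, 3))    # [3334, 3333, 3333]  ->  $33.34, $33.33, $33.33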


Yes, if you want to work with an existing system, you want to follow the conventions of that system.

Fixed point (ie calculate everything in cents) is a good convention for lots of things to do with money. But it's not a good convention to have as the default in a language like Python.

> Within a calculation, it may be fine to use float, but the number going into ledgers or transactions need to have the predefined precision.

I was not suggesting the use of float. I was suggesting using rational numbers by default. See eg https://docs.python.org/3/library/fractions.html


Because they have more confusing edge cases. Decimals getting rounded might be annoying, but it's at least understandable.


Ruby sort of does! The type of 1/3 is Rational!


I don’t understand how float would be an implementation detail and not the the thing you are trying to operate on. If a programmer uses a float they are most certainly wanting to use a float?


Because many programmers don't understand floats, even ones that have been doing it for years. Not to mention that higher-level languages are being used by non-programmers to script things out. I mean, I recently helped oversee a class and heard someone tell people new to programming that floats are basically decimals. I sounded like a pedantic jerk interrupting to explain the difference, and I'm sure none of them remembered.

But more to the point of your question, we use floating point math because that's what computers are good at, not because we want that for it's own sake. We want to figure out sales tax, or how long until our kid will need to buy new shoes, or what effect changing the speed limit had on the total number of accidents, or all kinds of other things that humans care about. Using a floating point representation may be the most efficient way to get some of those answers using the technology available, but it is just a step along the way, not what we actually want. That's what I mean by implementation detail.

Basically, I think that any literals typed into the interpreter should work the same way as a calculator. If you want something special for your implementation because it will work better or faster, then that should be explicit.


If you want to compute sales tax, you’re going to have to define a rounding behavior, too. A decimal type wouldn’t save you from this. The desired rounding behavior will vary by situation.

How your calculator handles rounding is itself an implementation detail. I have no idea what rules your calculator uses to round and it’s not in any standard. Does my calculator do the same thing? Unknowable.


I've always wondered if you use "round-toward-even" rules, or round-down rules when the fractional part of the sales tax amount is exactly 1/2 a cent. But have never wondered hard enough to actually find out.


The programmer _wants_ to operate on a real number; the mental/abstract model of whatever application they’re building almost certainly involves real numbers instead of floats.

It’s the conversion from an abstract model to a concrete instantiation where floats are used, generally out of necessity or ignorance.

The fewer details needed to do this conversion, the easier it is to develop programs. When I say easier, I mean it’s faster AND less buggy — since the conversion often introduces errors, subtleties, and logic not present in the abstract model.


Parent commenter is saying that in many cases when you write something like

    let foo = 0.1 + 0.2;
the vast majority of the time people want 0.1 and 0.2 to be decimals, not floats, so they should default to that.


This is an insanely bad idea. You think Python is slow now, wait’ll you see it after this “improvement”.


Python is slow because it does a lot of dynamic dispatch, which dwarfs the cost of actual operations such as addition. So it's the other way around - Python, of all things, could probably switch to decimal by default without a significant slowdown.

What would be much slower is all the native code that Python apps use for bulk math, such as numpy. And that is because we don't have decimal floating point implemented in hardware, not because of some fundamental limitation. If decimals were more popular in programming, I'm sure the CPUs would quickly start providing optimized instructions for them, much like BCD was handled when it was popular.


Eh, maybe. I don’t think many NumPy people would reach for a decimal dtype even if it were available and implemented in hardware. It just isn’t useful in the primary domains where NumPy is used.

That leaves vanilla Python. You’re right that compared to dynamic dispatch, a decimal op implemented in hardware is nothing, but the issue here (IMO) is death by a thousand cuts. Python is already slow as hell, and currently using decimal as a default would require a software implementation in many if not all places, which will be slow. Trade off does not seem worth it to me.



Read the other responses here, and the “Alternatives” section of the article you posted. I am very happy the default is not what you just suggested.


Sorry, my comment was about it being slow, not about it being a bad idea: it is a bad idea, but it will not really make your Python even slower than it is today. Intel supports most of IEEE 754-2008, which has most of what you would need.


Judging from other comments here, it is not clear how widely supported IEEE754-2008 is. If it isn’t supported everywhere, it would make a VERY bad default for a numeric type.

It also appears that the logic for implemented something like this standard is indeed slower than standard IEEE754. Even if it’s only a bit slower, seems bad to make it the default.

All this just to fix something which is confusing to a novice programmer… and this is leaving aside the additional complications a fixed width decimal has which a floating point type doesn’t have.


It's fully supported; however, some parts only via a library. Anyway, to drill down on my argument: it doesn't matter how slow or fast it is, because Python's floats, ints, ... are boxed. Your performance is down the drain before you even start looking at the value at hand. If you want any kind of performance at all while crunching numbers, you will not be doing it in Python (which is exactly what happens when you use things like numpy).


I really like this quote from the article as way of explaining this whole perceived anomaly:

> To me, 0.1000000000000000055511151231257827021181583404541015625 + 0.200000000000000011102230246251565404236316680908203125 = 0.3000000000000000444089209850062616169452667236328125 feels less surprising than 0.1 + 0.2 = 0.30000000000000004.


And in my opinion, this is why `0.1 + 0.2 = 0.30000000000000004` is a bad meme. It cements a very wrong perception about floating point numbers. If we denote a rounding operation as `f64(...)`, this is `f64(f64(0.1) + f64(0.2)) = f64(0.30000000000000004) != f64(0.1 + 0.2)` which can obviously happen. In particular, `f64(0.1) != 0.1` etc. but we happened to choose 0.1 as a representative for `f64(0.1)` for various reasons. Nothing inaccurate, nothing meme-worthy, just implied operations.


Yes, this is the best way to explain it. Your number literals are “snapping to a grid” that is not base-10 and then we choose the shortest base-10 decimal that snaps to the appropriate grid point when we stringify the number.

The other thing that I would mention is that I see some really gnarly workarounds to try to get around this... Just bump up to integers for a second! People have this mistaken idea that the best way to understand these rounding “errors” is that floating point is just unpredictably noisy for everything, and that's not true.

Floating point has an exact representation of all integers up to 2^53 – 1. If you are dealing with dollars and cents that clients are getting billed or whatever, okay, the best thing to do is to have a decimal library. But if you don't have a decimal library and it's just some in-game currency that you don't want to get these gnarly decimals on, 3/10 will always give 0.3. 4/100 will always give 0.04. Just use the fact that the integer arithmetic is exact: multiply by the base, round to nearest integer, do your math, and then divide out the base in the end: and you'll be good.
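
Something like this, for instance, for the in-game currency case (Python shown, but the idea is language-independent):

    BASE = 10                                          # work in tenths
    scaled = round(0.1 * BASE) + round(0.2 * BASE)     # 1 + 2, exact integer math
    print(scaled / BASE)                               # 0.3
    print(0.1 + 0.2)                                   # 0.30000000000000004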


A reasonable, but not always available, choice is to use integer quantities of the divided quantity. If you're dollars and cents, express things in cents. If you need tenths of a cent, express things in milliDollars. If you need 1/8th dollars, use those. Have a conversion to pretty values when displayed.

Sometimes you really do need to have a pretty good estimate of pi dollars, but often not.


what i haven't figured out is multicurrency. "cents" is fine for dollars but 1/100 doesn't work for all currencies. do you use a different denominator for each currency or standardize on 1/1e8 or something?


I'd use a different denominator per currency? You've got to keep track of the currency anyway, so have 1 USDCent, or 1 EURCent or 1 BHDFil (1/1000) or 1 GBPPence (1/100) or the historic 1 GBPFarthing (1/960 ??)

If the wikipedia article on Decimalisation[1] is complete and accurate, only Mauritania and Madagascar still have non-decimal currencies.

If you really needed it to be uniform, you could work in 1/1000th worldwide, as long as you didn't need to keep more decimals for other reasons.

[1] https://en.wikipedia.org/wiki/Decimalisation


And it's very bad UX that when you write "0.1" in the code or feed it to the standard string parser at runtime, what you get back is not actually 0.1. It's effectively silent data corruption. If you want "snapping to a grid" for perf reasons, it should be opt-in, not opt-out.


> If you are dealing with dollars and cents that clients are getting billed or whatever, okay, the best thing to do is to have a decimal library.

C# has decimal in the base library. We are doing a new project with financial data, and decided we will have everything in decimal - no floats at all.

There is no point in dealing with these issues to save an irrelevant amount of CPU.


The meme worthy thing is thinking that, given 3 sigfig inputs, 20 sigfigs is more desirable than 3. It's failing to distinguish noise from data.


I wonder how much confusion could have been avoided if compilers/interpreters emitted warnings for inexact float literals. It is a bit of a surprising pitfall: you generally expect the value of a literal to be obvious, but with floats it's almost unpredictable. Similarly, functions like strtod/atof could have some flags/return values indicating/preventing inexact conversions. Instead we ended up in this weird situation where values are quietly converted to something that is merely close to the desired value.


Why bother? Almost every float literal is inexact. Floats are inexact by design; that's how 64 bits can reach magnitudes all the way up to nearly 2^1024.

And the compiler can't help you at the application UI layer.


> you generally expect the value of a literal to be obvious, but with floats its almost unpredictable

Fractions in positional notations are not exact as a rule; mostly they are not exact. 1/3, 1/6, 1/7, 1/9 cannot be represented by decimals exactly (or they can, but only with an infinite number of digits in their representation). There are exceptions, of course: for decimals you need a denominator with no prime factors other than 2 and 5; for binary the only allowed factor is 2.


From a deleted comment I liked here from @stabbles:

> some things can be represented in finite digits in base x but require infinite digits in base y.

Very good summary. Binary to decimal is very straightforward until fractions require infinite digits. I don’t think dec64[1] is even a tradeoff—it’s just better. The significand stays a normal binary number— but it encodes the decimal point in gasp decimal. No infinities required for the numeric language that we all think in.

[1] https://en.wikipedia.org/wiki/Decimal64_floating-point_forma...


> it's just better

This assertion does not withstand scrutiny. You may like being able to get True as the result of 0.1 + 0.2 == 0.3, but the landscape will still be littered with rounding errors as soon as you try to do anything nontrivial. (Or even plenty of trivial things like expecting 1/6 + 1/6 to add to 1/3). So all you gain is a false sense of security in exchange for less precision and slower computation.

(Of course, there are plenty of tasks for which floats are just wrong for the job, and you should transform the problem so that you can use integers or rationals instead. For example, when you are incrementing a number by (integer multiples of a) fixed delta, just change units so you can count numbers of increments as an integer, and change units back at the end.)


> ... dec64 is ...

Not to be confused with Douglas Crockford's DEC64 [1], which I believe is worse than binary floating points.

[1] https://www.crockford.com/dec64.html


Oh, thank you! I actually think I’m going with Crockford on this one, and that’s what I meant to post.


In which case I disagree ;-). Most strikingly DEC64 doesn't do normalization, so comparison will be a nightmare (as you have to normalize in order to compare!). He tried to special-case integer-only arguments, which hides the fact that non-integer cases are much, much slower thanks to added branches and complexity. If DEC64 were going to be "the only number type" in future languages, it had to be much better than this.


Good points! I think decimal64 doesn’t normalize the significand also. But I can’t assess what I haven’t used. Mine is a snap judgment in favor of understanding more of dec64 vs the wiki article. My general feeling is that it’s time for the scale of computing to tip away from total correctness and efficiency, and more toward non-awkward interfaces. But at bottom, I would try both and then talk about it.


Mandatory link to the original "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg over on https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.h...


That url must have the highest linked-to-read ratio on the internet. Although it seems to be fading away, linked to less frequently.


I once inherited the development and maintenance responsibilities on a financial forecasting/business development app. There were lots of expected-value computations in the form of adding up potential revenues multiplied by their probability of winning, with costs subtracted (i.e. currency values). The previous developer had used floats everywhere, resulting in round-off artifacts like the values seen here.

I started using decimal in all of the new features in an effort to mitigate this but it resulted in even more headache as I now had a bunch of casts floating around on top of the truncation band-aids that I had to implement for existing lengthier calculations. My plan was to refactor the whole app and SQL schema but if memory serves I got pulled onto something more pressing before I had the chance.

This was especially disappointing to me because this was all implemented in C# and T-SQL which are languages with first-class support for decimal numbers. It wouldn't surprise me if the app is still in use today with some hapless dev halfway across the country whacking these bugs as they pop up.


This makes me wonder why Decimals aren’t part of std libs in Javascript or Golang.


It is hard to design. There should be some sort of precision and rounding mode control throughout the calculation, meaning either some sort of implicit state or controls sprayed into every operation. The latter is explicit but tedious [1]; the former will still surprise people from time to time. IEEE 754 binary floating point works reasonably well in this respect.

[1] See QuickJS's BigDecimal extension for example: https://bellard.org/quickjs/jsbignum.html#Properties-of-the-...


That's exactly why it needs to be in the standard library: so people don't make mistakes reinventing the wheel.


Only if you have a proven design. I don't think we have any satisfactory design at all.


What's wrong with https://en.m.wikipedia.org/wiki/Decimal64_floating-point_for...

Or with C#'s 128-bit implementation of decimal?

Is there no good implementation anywhere?


I don’t agree with the title, but this article is great if you want to get to the very bottom of the floating point rabbit hole. https://people.cs.pitt.edu/~cho/cs1541/current/handouts/gold...

What Every Computer Scientist Should Know About Floating-Point Arithmetic

One thing worth mentioning is that the IEEE 754 floating point standard is implemented in hardware in most CPUs (microcontrollers might deviate) so if you learn this stuff you’ll be set for life, as in, it doesn’t depend on the programming language you’re using.


See also: Floating Point visually explained (https://fabiensanglard.net/floating_point_visually_explained...) — for those allergic to mathematic notations

There's also a stackoverflow thread: https://stackoverflow.com/q/588004




For everyone working in tech... when your friends mock you and make funny comments about how tech is hard and often unpleasant to use, this is a perfect example. This thread has multiple `techy` examples, how binary numbers work, etc. Truth is, the year is 2023, we have CPUs with billions of transistors, and we still have fundamentally basic bugs in math. I really don't understand why languages don't default to decimals for literals.


All number formats have tradeoffs; nothing can robustly represent all real numbers. At least floats are standardized, and their problems are relatively well characterized and understood.


The problems with decimal rounding are even better understood and standardized, given that we've been working with decimals long before we had binary computers.


You realize that decimal rounding has the literal exact same type of problems as binary rounding right? It’s literally just a different base.


I do realize that. The point is that humans are familiar with the typical manifestations of those problems in decimal representation, because that's what we use day to day. Every kid who had a calculator to play with while they were bored knows what 1/3 does. They also know that 0.1 + 0.2 gives exactly 0.3.

The problem isn't binary itself, even. It's that our tools pretend that it's decimal, misleading people. If it looks like decimal, it should behave as one by default.


because some developers hate anything that makes things easier


This is great!

There was a post on /r/softwaregore recently where someone showed a progress bar on Steam that said "57/100 achievements! Game 56% complete" or something like that. I snarkily commented something about naively using floor() on floating point and then moved on.

But then I thought that that may not have been the problem, fired up emacs and wrote a C program basically just saying

  printf("%f\n", 57.0 / 100.0 * 100.0);
To my surprise this correctly gave 57.000000, but in python,

  57 / 100 * 100
Gave 56.999... anybody know what was up here? Different algorithms for printing fp?


The calculated number is indeed 56.99999999999999289457264239899814128875732421875, which is not the same as 57 (IEEE 754 binary32 or binary64 can represent the integer 57 exactly). You can verify this with the following:

    printf("%f\n", 57.0 / 100.0 * 100.0 - 57.0);
It just happens that `%f` in C defaults to 6 fractional digits.


Thank you!


The f in %f doesn't mean "floating-point"; it means "fixed digits": printing the value in the style -dddd.dddd. If you don't specify how many digits after the decimal point, it defaults to six.

If 56.999999999999 is rounded to 6 digits after the decimal point, you get 57.000000.


Yes, thank you for this. I just ran it with the floor() function actually called and it did give 56.000000. I didn't think to check the precision. Rookie mistake!


Thus, if you have a minute, try "%.20f" instead of %f.

:)

Or, how about this:

  #include <stdio.h>

  int main(void)
  {
    for (int prec = 0; prec < 25; prec++)
    {
      printf("%.*f\n", prec, 57.0 / 100.0 * 100.0);
    }
    return 0;
  }
Output:

  57
  57.0
  57.00
  57.000
  57.0000
  57.00000
  57.000000
  57.0000000
  57.00000000
  57.000000000
  57.0000000000
  57.00000000000
  57.000000000000
  57.0000000000000
  56.99999999999999
  56.999999999999993
  56.9999999999999929
  56.99999999999999289
  56.999999999999992895
  56.9999999999999928946
  56.99999999999999289457
  56.999999999999992894573
  56.9999999999999928945726
  56.99999999999999289457264
  56.999999999999992894572642
The 64 bit double will store 15 decimal digits reliably. That is to say, if you have a decimal figure with 15 significant digits, which is in range of the type (and not mapping to a denormal value close to zero and whatnot), all 15 digits are representable and can be recovered.

In the reverse direction, you need about 17 decimal digits in order to capture an 64 bit double as decimal text such that the exact value can be recovered from the decimal text.

Thus in the above loop's output, once we are past 17 digits (including the 56 before the decimal point), we are no longer seeing any new data, just a continuation of the fraction.

And, notice how the last value that is still 57.000.... is exactly 15 digits wide. The next row is 16 digits, and that's where we now have 56.9999.... but 16 digits isn't quite enough to capture the value. I believe the next row gets us that: the ...99929. If we use that as a constant, any digits after that make no difference.

Programming languages which, by default, print floating-point values to 15 digits will show the nice result .1 + .2 = .3.

This is what I did in TXR Lisp.

  1> *print-flo-precision*
  15
  2> (+ .1 .2)
  0.3
  3> (set *print-flo-precision* 16)
  16
  4> (+ .1 .2)
  0.3
  5> (set *print-flo-precision* 17)
  17
  6> (+ .1 .2)
  0.30000000000000004
We can see there is no value difference in digits beyond 17:

  7> (eq 0.30000000000000004 0.300000000000000049)
  t
  8> (eq 0.30000000000000004 0.300000000000000040)
  t
To get different value (different floating-point bit pattern), we need a difference in the 17th digit. And not just a single increment:

  9> (eq 0.30000000000000004 0.30000000000000003)
  t
  10> (eq 0.30000000000000004 0.30000000000000002)
  t
The last digit being 3 and 2 is still mapping to the same value. When we make it 1, we start getting a different float:

  11> (eq 0.30000000000000004 0.30000000000000001)
  nil


It's possible for games to have over 100 achievements, which could lead to it rounding to 0% even when they have an achievement, or to 100% when they still have one missing.

People won't like this, so perhaps Steam put in some logic to adjust the numbers, and it is also affecting your case.

EDIT: But I would have gone for ceiling(99*achievements/total), which does give 57% in this case.


Compiler optimization.


Wrong guess; the algebraic optimization you might be thinking of is simply not allowed.

The expression is likely subject to an optimization known as constant folding: the compiler calculates the value of the constant expression, and substitutes that value into the code.

However, that constant-folding calculation has to produce the same result as what would happen at run-time: the 56.999999999999992895... approximation of 57.

Constant-folding having to produce the same results as run-time creates a challenge in cross-compiling situations, when the host machine's math is different from the target machine's math. The compiler must emulate the target machine math.


Compilers are very cautious about floating point optimization because many real-number identities do not hold in FP, so they won't do much unless instructed to. Even if they do constant propagation, they will generally calculate in binary, not in decimal. (Exception: decimal literals in some languages like Go are untyped and can be exactly calculated before getting rounded once at the end.)


I guess I'm misunderstanding what `-frounding-math` for gcc does / does not do.


And what optimization did you tell the C compiler to use?


(why did that get a downvote? It's C: optimizer flags are stupidly important, and half the things you think are defaults are actually undefined behaviour...)


I didn't downvote you. (In fact, I don't have access to the downvote button lol, but I wouldn't have used it here in any case.) But a comment above yours already discussed this possibility, and there were replies as to why this is not the problem here. C and Python are giving the same fp result just printed to different precisions. Ftr I was using the defaults, whatever those may be:

  gcc fp.c -o fp


I have no idea what a floating point is but I still enjoyed reading the comments. I bet there are many people like me who love HN but choose not to make their presence known so as not to be downvoted (full disclosure: I've never down- or up-voted. But I do know this: on the rare occasions I remark on how funny or clever I found something, that comment usually gets downvoted).


Computers store numbers in base 2 rather than base 10. That means each digit is no longer the ones/tens/hundreds place, but instead the ones/twos/fours/eights place. This is all fine when talking about integers, since every integer in base 10 can be represented in base 2.

The problem is when converting decimals. Now instead of each digit being tenths/hundredths/thousandths, we have halves/fourths/eighths/etc. Now try out the problem yourself. Imagine you had a formula in the form of

(1/2)x + (1/4)y + (1/8)z + (1/16)a + (1/32)b ...

Try to find a solution (with each of x, y, z, ... being 0 or 1) that adds up to exactly 0.1 and you'll see that it can't be done. Computers just get as close as they can.
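
You can see this directly in Python: the exact value the float literal 0.1 actually stores is the nearest fraction with a power-of-two denominator, not 1/10 itself.

    from fractions import Fraction

    stored = Fraction(0.1)              # the exact value behind the float 0.1
    print(stored)                       # 3602879701896397/36028797018963968 (denominator is 2**55)
    print(stored == Fraction(1, 10))    # False -- no finite sum of halves/fourths/eighths hits 1/10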


Think of floating point as a "floating" "point", like the dot that separates the two parts of a number like 3.1415. It floats around because this allows you to have more precision when the number is small without giving up the ability to represent large numbers. Fixed-point number encodings also exist and are much simpler, e.g. use 24 bits for the integer part and 8 bits for the fractional part.

Anyway, thought I'd give you some background. This stuff is easy to look up too. You're probably getting downvoted because HN doesn't really like unsubstantive comments.


> "...on the rare occasions I remark on how funny or clever I found something, that comment usually gets downvoted"

Not unjustifiably; without specifically referring to any posts you may have made, such comments in general add no insight or value to the discussion. It's just noise that has to be scrolled past.

(The same goes for discussing getting downvoted, which is discouraged by the HN guidelines.)


At first I was expecting an ill informed rant, until I saw it was Julia Evans, who always digs down and then explains a subject with great clarity.


> I think the reason that 0.1 + 0.2 prints out 0.3 in PHP is that PHP’s algorithm for displaying floating point numbers is less precise than Python’s

It's a display thing, not an algorithm thing. It rounds by default at a certain length, previously 17 digits.

    php > echo PHP_VERSION;
    8.2.1
    php > $zeropointthree = 0.1 + 0.2;
    php > echo $zeropointthree;
    0.3
    php > ini_set('precision', 100);
    php > echo $zeropointthree;
    0.3000000000000000444089209850062616169452667236328125
https://www.php.net/manual/en/ini.core.php#ini.precision


I love that this is a common enough problem that there's a whole domain dedicated to it:

https://0.30000000000000004.com/


"The short answer is that 0.1 + 0.2 lies exactly between 2 floating point numbers, 0.3 and 0.30000000000000004, the answer is 0.30000000000000004 because its significand is even."


That's a poor summary. It's round(0.1) + round(.2) that sits between 0.3 and 0.3000...4 (rounding in base 2)
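
Either way, the halfway claim itself checks out; for instance, using Python's Fraction to do the arithmetic exactly:

    from fractions import Fraction

    exact_sum = Fraction(0.1) + Fraction(0.2)       # exact value of round(0.1) + round(0.2)
    below = Fraction(0.3)                           # nearest double below
    above = Fraction(0.30000000000000004)           # nearest double above
    print(exact_sum - below == above - exact_sum)   # True: it lands exactly halfway between them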


I'd love to see an in-depth exploration of exactly how float imprecision happens. Why does the error show up as a trailing 4 rather than a 3? What binary magic is causing these spurious results?

We've all been told about float imprecision in a very hand-wavy way that doesn't actually explain anything at all about the problem. On its face, one would assume that any two binary values being added would have a reliable and deterministic value, but that's not how floats work. There's some magic in the stack that causes errors to creep in, which isn't something we see with other types of numerical representation. It's very easy to show and understand that 01b + 01b = 10b, but that logic somehow doesn't apply to floats.

I think it's a very interesting subject and I wish there was more discussion of the real causes rather than just "oh yeah, floats do that sometimes, just ignore it"
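
One way to peek behind the curtain without any hand-waving is to print the exact values the floats actually hold; in Python, for instance:

    from decimal import Decimal

    print(Decimal(0.1))        # 0.1000000000000000055511151231257827021181583404541015625
    print(Decimal(0.2))        # 0.200000000000000011102230246251565404236316680908203125
    print(Decimal(0.1 + 0.2))  # 0.3000000000000000444089209850062616169452667236328125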


I really like biginteger rational types when working with typical non-scientific non-integer problems, like classical accounting. Everything continues to work without a hitch when somebody decides they need to support a tenth of a cent (gas stations), they need to support a tax like a third of a percent, they need to seamlessly mix with smaller units like satoshis, ....

It's a solution that never fails (by hypothesis, you aren't working with sin or pi or gamma or some shit), and for every practical input the denominator stays small enough that it fits nicely into machine registers. You have the bigint fallback for the edge cases, in case they're needed.

Floats IMO are only suitable when you're actually okay with approximate results (perhaps including known convergence properties so that you can reify an ostensibly floating point result into more exact values).


Hopefully, we will have posits with hardware support sooner rather than later.

https://spectrum.ieee.org/floating-point-numbers-posits-proc...

https://www.cs.cornell.edu/courses/cs6120/2019fa/blog/posits...

Unfortunately, they are not a solution to the OP's problem, which is fundamentally embedded in the architecture of computers. One has to find an appropriate representation for the needed numbers in bits and bytes.

In Python one can use fractions or decimals, if the float format is not good enough. Other options are fixed point arithmetic or arbitrary precision arithmetic. Choose one that combines the needed characteristics with the least amount of work.


TL;DR;

0.1 in binary is a repeating fraction, just like 1/3 in base 10 is a repeating decimal.

So you get errors due to rounding.

This is also why it's super important in loops to never use "=" as a condition statement, but instead use "<=" (or ">="). Otherwise you might create an infinite loop.


It is also sometimes useful to write abs(x-y) < epsilon.


A caveat here is that a good value for epsilon depends on the magnitude of the values you're comparing.
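
That's essentially what math.isclose in Python gives you: its default tolerance is relative, so it scales with the operands (with an optional absolute floor for values near zero).

    import math

    print(math.isclose(0.1 + 0.2, 0.3))    # True

    a = 1e20
    b = a + 10000.0                         # about one ulp away at this magnitude
    print(abs(a - b) < 1e-9)                # False -- a fixed epsilon is hopeless here
    print(math.isclose(a, b))               # True  -- the default rel_tol=1e-9 scales with a and b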


> super important in loops to never use "=" as a condition statement, but instead use "<=" (or ">="). Otherwise you might create an infinite loop.

Never seems a bit strong, it depends on the context. You can use = on int types all day.


But never with floating point. I think that's the context of the whole article, and should really be the main take-home (which I'm not sure the article gets across clearly enough).


you can use it with floats just fine if your condition boundary is a "whole number float", like running from 0.0 to 1.0 in steps of "some small increment". It's only when your boundary has a non-zero fraction that you're going to have to be more careful.


>running from 0.0 to 1.0 in steps of "some small increment".

I'm not sure what you mean; those aren't integers (if that's what you mean by "whole number float"):

In python3:

    >>> 0.4 == 0.6 - 0.2
    False

(edit: formatting for python)


fair, that was completely the wrong example.



The choice of decimal vs float depends on domain. For financial calculations you want to be able to represent amounts exactly. Decimal can represent $1.01 exactly, but float cannot (it picks a value very close to that, and as you do calculations eventually these small inaccuracies amount to real money).

But if you are storing the weight of a product a float might be fine - you don't really care if the system thinks you have 50.000001 pounds of product when you have 50 pounds (because your scale isn't that accurate anyway).

Floats will generally give you better performance than decimal types as well.


Thing is, the practical cases where you want to represent amounts exactly all involve decimals, not binary. If you default to decimals, and someone unknowingly uses them in a situation where binary floats would be better, they still get results they expected, just slower. But if you default to floats, and someone unknowingly uses them in a situation where an exact decimal amount is needed, it can actually produce incorrect (in the sense defined by the functional spec) results. Defaults should be safe, and performance optimizations that are more likely to affect correctness should be opt-in.


Would you say it's "odd" that popular languages (C#, Java, JavaScript/node.js/TypeScript, Python) don't have built in decimal libraries?


Java has BigDecimal. Python has a decimal and a fraction class. I’m assuming C# has something as part of its standard library as well. And JS/TS has third-party libraries that can do arbitrary-precision math as well.


C# has it as part of the language itself, with keyword for the type and a special literal syntax for it:

   decimal n = 123.456m;


Thank you. I had a feeling I was wrong. So just JavaScript has BigInt/BigDecimal https://stackoverflow.com/questions/16742578/bigdecimal-in-j..., interesting


I don't think I would say it is odd because it isn't a type of number that is directly supported by the hardware. That said it is available in the standard library of most popular languages, and probably just a package install away for any other languages.

Also, a plain integer is usually sufficient for financial calculations (just count pennies, not dollars) so in some ways I think decimal types are overused when an integer would do just fine.


This is the first clear and concise explanation I've read about why a number as "simple" as 0.1 is imprecise as a float. Thanks!


This is one of my favorite interview questions. I embed the issue in a short block of code, run it, and ask the candidate to explain the what, why, and how to fix. I work in a field with physical measurements. Maybe 1/3 of candidates get it correct


What do you gain from asking this question in an interview? Let's be honest, you've spotted it online and used it to show that you have a higher understanding. If I had an interviewer ask me this it would be a massive red flag. You are interviewing a candidate - it isn't the place to recycle someone else's online answer to flex


No, we’ve actually had this issue in production. I did not “spot it online”. The question I ask is framed how the bug appeared for us at the time. It’s a great way to screen out people who are too academic and do not consider the limitations of the systems they use.


You asked about an issue that _you_ faced in prod; you have the benefit of hindsight.

I'm assuming that you are interviewing candidates from fields where this bug is not common.


This isn't a competition. They aren't in a classroom where the professor is asking a trick question. This person wants to hire someone who can spot the bug, understand it, and propose solutions.

Who cares if the candidate comes from a field where this isn't common? That's the point. You want to hire people who would know what to do if they came across it.

It's amazing the amount of hostility here towards a perfectly sensible interview question and process.


> If I had an interviewer ask me this it would be a massive red flag.

Do you work in a field involving physical measurements like OP does?


I love when someone figures out the detailed reasoning behind why some things are the way they are. Sometimes I go down those rabbit holes myself and never figure it out, but when I do, it feels so good! Just as reading articles like this.

Thanks, Julia!


It still kills me that (correct me if I'm wrong) the original version of Smalltalk on the Xerox PARC Alto had a native arbitrary-precision Fraction type.


By the time you finish reading this thread, you could have learned how floating-point numbers work, and wouldn't have to ask (or opinionate). :-P


This result is strictly a result of floating point math.

You never get this kind of result with fixed point or binary coded decimal math.

The computer Language M never has this kind of problem.


> This result is strictly a result of floating point math.

This specific result comes from binary floating point math at a particular precision. More precision, or decimal floating point, will fix this case, but will have similar kinds of errors for the same operation on different numbers. Fixed-size BCD/fixed-point math has other limitations; it's not a general solution.

The general solution is to have a numeric tower where representations and operations meet the following rules:

1. The default representation of any exact literal (without a modifier representing a particular inexact representation, e.g., as an optimization or a necessity of interfacing with an external library) is exact,

2. Any operation between exact representations that can be done exactly is done exactly unless specified otherwise (as it might be for the same reasons discussed above), and stored in a representation that can represent the result exactly.

3. Essentially inexact operations or operations on inexact numbers are conducted in a way and produce output representations that minimize additional imprecision introduced, except when explicitly specified otherwise.

Computer algebra systems where the top-level representation is symbolic are potentially the ultimate expression of this, but the Scheme numeric tower is pretty good (but at least Racket, and I think schemes in general, represent exact decimal fractions as floats still, so don’t entirely avoid the problem, but at least division of integers produces exact rationals.) Lots of languages default to putting numbers expressed as literals into either fixed-sized integers (not bad, especially when those are often 64-bit now which rarely has much practical distinction from arbitrary precision in most applications) or fixed-sized binary floats (which are more problematic, especially given the mismatch between clean binary and clean decimal representations.) This is very good for efficiency, because computers can process fixed-sized integers and binary floats very quickly. But, especially for floats, it can be bad for correctness when doing arithmetic where the input is all clean decimal literals.


Yes, Scheme has the concept of exactness[0]. In most Schemes, (exact? 2.3) ==> #f. But at least the exact? procedure exists. It can tell you whether your result might have been affected by these problems. If you want perfect answers, you can always use the rationals that are built into Scheme implementations with the full numeric tower.

[0] https://standards.scheme.org/corrected-r7rs/r7rs-Z-H-8.html#...


(inexact->exact x) and (exact->inexact x).


What about them?


It is not "floating point math" but "floating point math where the exponent is expressed as a power of 2" that is the problem.

That is, if the exponent is a power of 2 you can write 1/2, 1/4, 1/8 exactly but you can't write 1/3, 1/5, 1/10, etc.

If the exponent is base 10 then you can write 1/5, 1/10, 1/1000 and such exactly.

Note the mantissa and the exponent are both integers and so far as this problem is concerned it does not matter if these are written in binary or BCD or some other representation. There is some controversy about what is better, if you use a binary mantissa the math is a little faster and more accurate, if you use a decimal mantissa conversions to and from ASCII are quicker and ASCII conversions are a major part of real life math workloads.
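
To make the base-of-the-exponent point concrete, here's a small sketch using Python's standard decimal module (a software decimal floating point, so it also illustrates the speed trade-off): 1/10 is exact in base 10, while 1/3 is inexact in either base.

    from decimal import Decimal

    # Binary floating point: 1/10 has no finite base-2 expansion, so this fails.
    print(0.1 + 0.2 == 0.3)                                    # False

    # Decimal floating point: 1/10 is exact in base 10, so this succeeds.
    print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))   # True

    # But 1/3 has no finite base-10 expansion either, so the same kind of
    # rounding error reappears for different numbers.
    third = Decimal(1) / Decimal(3)
    print(third * 3 == Decimal(1))                             # False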

I've long thought that this problem is one of a list of problems that many people encounter on the path to bending computers to their will and that some people decide that computer programming isn't for them because of this kind of problem. I think the kind of person who learns Python to put their outside-of-computing skills on wheels is particularly affected.

It is a "disruptive technology" problem because the person who is using IEEE floats heavily has accommodated to this problem and would not give up the slightest amount of performance. Decimal FP can be implemented in software but is slow in software. IBM has had hardware Decimal FP in their mainframes for a very long time and there is even an IEEE standard. (A company that has been using mainframes for a long time cut a check for the wrong amount because the abused the number system back in 1963 and thus learned their lesson a long time ago.)

The best hope I have is that the "social justice" people can be led to believe that unintuitive numerics keep underrepresented people out of the field and that they threaten Intel that they'll tear down their headquarters unless they catch up to where mainframes were 50 years ago. It could be a huge win for the industry and for the DEI office because employers would have to buy everyone a new computer and even white guys might think the DEI office was doing good work if it meant they got to replace their 5 year old corporate craptop. I mean, how is it that a few people with two fingers get to oppress all the rest of us with ten?


Tangentially (and separate from my other response because it's a whole different issue):

> The computer Language M

Which one? MUMPS (also known as M), or the Power Query Formula Language (also known as M)? Or something else?


M, alternatively known as MUMPS.

M guarantees at least 15 digits of precision, which is why it is heavily used in banking and financial applications.


Fixed point behaves well under addition and under multiplication by integers; otherwise it isn't very good. A fixed-point format has far more representable values in (1, infinity) than in (0, 1), for example, so 1/x loses catastrophic amounts of information.


How do you represent 1/10 exactly in binary fixed-point representation?


> How do you represent 1/10 exactly in binary fixed-point representation?

The base of your number system does not need to be the same as the base associated with the fixed point position.

Easiest to explain: if you have a uint64 that represents a monetary value, you can let it express the number of dollarcents instead of dollars. Then you can express 1/10 dollar as 10 dollarcents.
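
A minimal sketch of that scaled-integer idea in Python (the cent formatting at the end is just an illustration, not something the parent comment prescribes):

    # Amounts held as integer "dollarcents"; arithmetic on them is exact.
    a = 10                  # 1/10 dollar, i.e. 10 cents
    b = 20                  # 2/10 dollar, i.e. 20 cents
    print(a + b == 30)      # True: exactly 30 cents

    # The same sum on binary floats fails the exact comparison.
    print(0.1 + 0.2 == 0.3)    # False

    # Convert to dollars only at the display edge.
    total = a + b
    print(f"${total // 100}.{total % 100:02d}")    # $0.30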


The claim was "you never get this kind of result with fixed point math", not "you can contrive to avoid this kind of result with fixed point math if you use a different base for the fractional part".


...or, indeed, 1/3 as binary coded decimal.


1/10 is exactly 0.1


The article isn’t about the imprecision in general, but rather about the details of this particular equations imprecision, and how that leads to this particular answer.


So 1/3 + 1/3 + 1/3 = 1 in M, right?


Well, it is in Racket:

  Welcome to Racket v8.7 [cs].
  > (/ 1 3)
  1/3
  > (+ (/ 1 3) (/ 1 3) (/ 1 3))
  1
  > (integer? (+ (/ 1 3) (/ 1 3) (/ 1 3)))
  #t


It doesn't have to

    $ racket
    Welcome to Racket
    > (read-decimal-as-inexact #f)
    > (+ 0.1 0.2)
    3/10
Lovely.


You can use Herbie to increase the accuracy of your floating point operations.

From the tutorial:

Herbie rewrites floating point expressions to make them more accurate. Floating point arithmetic is inaccurate; even 0.1 + 0.2 ≠ 0.3 for a computer. Herbie helps find and fix these mysterious inaccuracies.

https://herbie.uwplse.org/



The entire intrigue of the opening paragraph is the claim:

> there’s a floating point number that’s closer to 0.3 than 0.30000000000000004!

But then you read to the end and find out this wasn’t true, and the answer to the clickbait headline is just, “Because it’s the closest floating-point value to the correct answer”.


You're mistaken. In fact 0.3 is closer to the next-smaller float than it is to the next-larger one (the one that results). But the result of float(0.1) + float(0.2) is none of those values; it is exactly halfway between the floats just below and just above 0.3, so it gets rounded up instead of down.
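
If anyone wants to check that halfway claim, here's a small Python sketch (math.nextafter needs Python 3.9+; Fraction(x) gives the exact rational value of the float x):

    from fractions import Fraction
    import math

    lo = 0.3                        # float(0.3): the nearest double, just below 3/10
    hi = math.nextafter(0.3, 1.0)   # the next double up: 0.30000000000000004

    # 3/10 is closer to lo than to hi, so the literal 0.3 rounds down to lo.
    print(Fraction(3, 10) - Fraction(lo) < Fraction(hi) - Fraction(3, 10))    # True

    # But the exact sum float(0.1) + float(0.2) lands precisely halfway between
    # lo and hi, and the tie is broken upward here (round half to even).
    exact_sum = Fraction(0.1) + Fraction(0.2)
    print(exact_sum == (Fraction(lo) + Fraction(hi)) / 2)    # True
    print(0.1 + 0.2 == hi)                                   # True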


Wow the "we should be using decimal" takes in this thread are hilariously misguided.


Why isn't the floating point version of 0.1 just 0.10000000000000000000000000000000000000000000000000000000000000000000000000000000, instead of something like 0.10000000000000000555111512312578270211815834045410156250000000000000000000000000?
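
For what it's worth, that long digit string is just the exact value of the double nearest to 0.1, and you can print it yourself; a quick Python illustration:

    # Print the double nearest to 0.1 to 80 decimal places: the output is the
    # long digit string quoted above, i.e. the exact stored value plus padding zeros.
    print(f"{0.1:.80f}")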



For those who prefer a more visual and interactive demo: https://evanw.github.io/float-toy/

It's not very complicated when you see it in bits


Because 0.3 cannot be represented exactly by a float.

And C/C++ doesn't have binary coded decimal natively. If it did, then much of what we use decimals and fractions for would be easier to show in decimal.


I'm sure multiple variants have been implemented. Borland C++ V2.0 from 1991 (version chosen because it's on bitsavers.org) has a bcd type which IIRC uses the 80-bit 8087 packed decimal format, but it's been a long time since I looked at this deeply.


I remember we had Borland Turbo Pascal 3.0 with BCD in the mid 1980s.


Simple explanation: because computers do not 'understand' the concept of decimals. 'Natively', they can only manipulate integers.


Computers, as we use them, do operations on finite sets.

Real numbers are simulated using a finite set of bits.

So equality comparisons are not useful for floating point numbers.

This is computing 101.
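
Hence the usual advice to compare floats with a tolerance instead of exact equality; a minimal Python sketch (math.isclose shown purely as one stock way to do it):

    import math

    print(0.1 + 0.2 == 0.3)                # False: exact equality is too strict
    print(math.isclose(0.1 + 0.2, 0.3))    # True: compares within a relative tolerance
    print(math.isclose(0.1 + 0.2, 0.3, rel_tol=1e-12))    # the tolerance is tunable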


Looks like the title got messed up in the process; the + sign was definitely not supposed to be 'and'.


People often use “and” to mean addition. “One and one make two”. So I think the title works fine.


I personally read it as 0.1 and 0.2 are both equal to 0.3000-whatever. Maybe it's because English is my second language.


"and" as a conjunction is sometimes used to mean "plus". As in "two and two make four."


> Why does 0.1 and 0.2 = 0.30000000000000004?

Because we have 10 fingers rather than 8.


It doesn't. That might be the number returned by your software, but it is inexact.


Test with Unicon:

    procedure main()
        if (0.1 + 0.2 == 0.3) then { write("true") }
    end

It prints "true".


Why does the title miss the key words "sum of" or the symbol "+"?


Hard mode: what application would be affected by 0.3 being off by 10^-17?


And that's the reason why Decimal types are your friend, kids.


Very large values of 0.1


an aside (to Julia's always excellent explanations) - it sure would be nice if python had first class support for decimals :/


Because JavaScript hurr durr


The answer is: "If it matters to you, you should not be doing FP math."


If that .000...004 matters to you, your application requires more precision than is available from the IEEE 64 bit double.


Well... IEEE 754 has well-defined behavior and detailed reasoning behind it. If it matters to you, you may still want FP math.


because it is right


because it's right


TL;DR: Because of IEEE 754 encoding, yo


tldr; use Decimal


The proper tl;dr is: understand how floating point works (you can represent a huge range of numbers with a lot of precision, but not perfect accuracy, and with high performance), and then decide if it is the proper data type for your problem. For example, setting the heading and velocity of a missile (85.623 degrees, 500.138 m/s (idk how fast missiles go)), floats are great. It is impossible to steer an exact course anyway due to wind, temperature, etc. (I assume...).

Storing the number of dollars in your savings account: use decimal, or better yet an integer (just count the pennies).
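
A small sketch of the money case in Python (the decimal module and the quantize-to-a-cent rounding policy are just illustrative choices here, not something the parent prescribes):

    from decimal import Decimal, ROUND_HALF_UP

    CENT = Decimal("0.01")

    # Decimal amounts built from strings stay exact for sums like this.
    balance = Decimal("0.10") + Decimal("0.20")
    print(balance == Decimal("0.30"))    # True

    # Where rounding is unavoidable (e.g. a 7% fee), round to a cent explicitly.
    fee = (balance * Decimal("0.07")).quantize(CENT, rounding=ROUND_HALF_UP)
    print(fee)                           # 0.02

    # The plain-integer alternative: just count pennies.
    print(10 + 20)                       # 30 cents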


IEEE 64 bit double can represent a missile velocity of 85.6230000000000.


Because mantissa and exponents

CS101 -- can we please move on.



