Hacker News new | past | comments | ask | show | jobs | submit login

That isn't exactly true. Floating point error is only really unbounded if you hit a cancellation (ie subtracting two big numbers), which you can avoid by doing some algebra.



That is totally wrong.

You can not in general avoid cancellation, even claiming that is ridiculous. WTF are you even saying.


There is no generic algorithm to completely prevent cancelation. However, there are a lot of specific little ways you can do algebra to push it around so it doesn't hurt you badly (note that I said "avoid", not "prevent"). I would conjecture that the vast majority of numerical systems can be designed that way if you take the time to think about it.

Or you can just use something like herbie that thinks about it for you: https://herbie.uwplse.org/


>I would conjecture that the vast majority of numerical systems can be designed that way if you take the time to think about it.

Sometimes there are ways to mitigate this, sometimes there aren't. Sometimes you need to precondition, sometimes you need to rearrange, sometimes you need a different algorithm, sometimes you need to normalize, sometimes you need to use a different arithmetic and so on.

For solving linear systems alone, there are definitely thousands of papers dealing with the problems arising from this. For every single algorithm you write and for all data which comes into that algorithm, you need a careful analysis if you want to exclude the potential of significant numerical errors.

Your comment makes it seem like this is a small problem, where you can just look at an algorithm for a time and fix it, this is literally a hundred year research project in numerics.


> Sometimes there are ways to mitigate this, sometimes there aren't. Sometimes you need to precondition, sometimes you need to rearrange, sometimes you need a different algorithm, sometimes you need to normalize, sometimes you need to use a different arithmetic and so on.

> For solving linear systems alone, there are definitely thousands of papers dealing with the problems arising from this. For every single algorithm you write and for all data which comes into that algorithm, you need a careful analysis if you want to exclude the potential of significant numerical errors.

It sounds like we agree that cancellation is avoidable with some analysis, and there are hundreds of techniques you can use to deal with it, but mostly it's the ~5 you listed there. And as you suggest, I don't believe this is nearly as significant a problem in the general case as you think it is. A careful error analysis is possible if you care (and if ever you cared, it would be on a spacecraft), and far easier in floating point than in many other number systems, including fixed point number systems.

Numeric systems that truly fix cancellation are incredibly big and heavy, and cannot usually be used for real-time calculations in a generic form. Fixed point certainly doesn't fix cancellation - it introduces precision loss issues on every operation you do that causes a number to go down in magnitude. It is actually harder to design systems in fixed point that avoid massive precision losses than it is in floating point, and the error analysis is much more substantial.


>I don't believe this is nearly as significant a problem in the general case as you think it is.

My original comment was about manned space flight in particular. If your application is relatively generic I think it is completely okay, if you are aware of it and mitigate the most pressing issues.

>Numeric systems that truly fix cancellation are incredibly big and heavy, and cannot usually be used for real-time calculations in a generic form.

You can use interval arithmetic, which guarantees that you at least know when cancellation has occurred. Interval arithmetic is fast enough for real time, although it has its own significant drawbacks.

> It is actually harder to design systems in fixed point that avoid massive precision losses than it is in floating point, and the error analysis is much more substantial.

Absolutely. My point was, that a manned space craft, might just be the point to do it.


> My original comment was about manned space flight in particular. If your application is relatively generic I think it is completely okay, if you are aware of it and mitigate the most pressing issues.

Everything we have been talking about relates to space flight. In fact, with humans on board, you can afford to be a lot less precise, because they can work around most numerical issues by hand. The Apollo guidance computers, for example, were prone to occasional instances of gimbal lock and numerical instability, and the astronauts just fixed it.

> You can use interval arithmetic, which guarantees that you at least know when cancellation has occurred. Interval arithmetic is fast enough for real time, although it has its own significant drawbacks.

Interval arithmetic does not prevent cancellation. It's just two floating point calculations, both of which are actually less precise than the one you would do otherwise (you don't use default rounding for interval arithmetic, you round the bottom down and the top up). You do know when things have been canceled, but you know that in a floating point calculation anyway if you have done the error analysis.

My overall point here is that NASA isn't missing anything by using floating point instead of using other weird or exotic arithmetic systems. Double-precision floating point combined with a rudimentary error analysis and some algebra is good enough for pretty much everything, and you may not be able to do better at all with fixed point. Designing fixed point algorithms also depends on a very careful analysis of interval ranges and precisions, and often gets you nothing over just using "double", where the error analysis is easier anyway.

If you need to do better than double, there's also double-double arithmetic for your hard parts, which is a similar speed to interval arithmetic and doubles the precision you get beyond double.


Why not?


Because if your algorithm contains an "-" and you don't have complete control over the inputs to both sides of that minus, you will have cancellation.

There is no general way to mitigate that, you can use numerically superior algorithms or screen your inputs, but these only help in specific cases. There is no general way to avoid this, every algorithm needs to be treated specifically.


Okay, so, in practice, we do know lots of things about the inputs. It may be impossible "in the general case," but engineering is done in the specific case. Anybody writing control software for spacecraft knows whether the mass of the moon or the spacecraft is larger, and can rearrange their equations accordingly.


That's not really related to floating point though. You'll have the same issues for fixed point.


>That's not really related to floating point though. You'll have the same issues for fixed point.

You don't. "-" is exact for fixed point unless the operation falls outside the range of valid values.


Exact and precise are different ideas. In fixed point, all operations but division are exact. In fixed point, all operations have precision related to the magnitude of the numbers being operated on. You can have a 64-bit fixed point number system that gives you 16 bits of precision on most of your operations.

In floating point, almost every operator (other than subtraction) has precision of the full width of the mantissa minus 0.5 ULPs. All operators are not guaranteed to always be exact, but they are far more precise on average than equivalent operators in fixed point.

Cancellation isn't an issue of exactness, it's an issue of precision.


Sure, but errors for fixed point happen very differently to floating point errors.

E.g. (a-b)*c, which is the common example for cancellation, if a and b are very close, can have an unbounded error compared to the result in the real numbers, in floating point. Since all operations besides "/" are exact in fixed point, no error can be introduced by this operation in fixed point (if all operations are representable).

Claiming that fixed and floating point are suffering the same way is just wrong.


(a - b) in fixed point will have very low precision, too, and will generally be exact in floating point. The following multiplication by c may or may not be exact in floating point (or in fixed point, mind you - multiplication of two n bit numbers exactly needs 2n bits of precision, which I doubt you will give yourself in fixed point). The "unbounded" error comes in when a and b themselves are not precise, and you have that same precision problem in fixed point as you do in floating point.

For example, suppose your fixed point format is "the integers" and your floating point format has 6 significant digits: if you have real-valued a = 100000.5 and b = 100001.9, both number systems will round a to 100001 and b to 100002. In both cases, (b - a) will be 1 while (b - a) should be 1.4 if done in the reals. That rounding problem exists in fixed point just as much as in floating point. In both systems, the operation that causes the cancellation is itself an exact calculation, but the issue is that it's not precise. Fixed point will just give you 1 in the register while floating point will add a bunch of spurious trailing zeros. Floating point can represent 1.4, though, while fixed point can't. If a and b were represented exactly (a = 100001 and b = 100002 in the reals), there would be no problem in either number system.

The only times that you get better cancellation behavior are when you have more precision to the initial results, which when comparing double precision float to 64-bit fixed point comes when your operands in fixed point have their MSB at the 53rd position or above. That only happens when your dynamic range is so deeply limited that you can't do much math.

When you are thinking about cancellation numerically, exact is a red herring. Precise is what you want to think about.


If a and b are very close this will break in fixed point too.

It depends on how many bits you have. For example compare f128 with Q8.8. Which one do you think would give better astronomical calculation results?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: