> There's no catastrophic cancellation or anything. And floating point has much more dynamic range than precision, so to speak, so until h literally reaches zero you aren't running into trouble.
I just tried it to be sure, in C with "double", and organized the calculation the way it's described in the paper you cited, formula 2.1.
It failed due to catastrophic cancellation close to an interpolation point, depending on the values involved. An equality test was not enough.
Approaching the first interpolation point, the output first departed from the correct value with some incorrect Inf results, and then switched to NaN, even though the input was never equal to any interpolation point.
This was trivial 1d interpolation between two points, where the points were (0.0 => 10.0, 1.0 => 20.0).
Hm, interesting. I also tried some numerical experiments to sanity-check and I got reasonable results right up to 1 ulp away from the interpolation points. I tried again using exactly your example -- f(x) = 10+10x, interpolation points at 0 and 1 -- and again everything seemed fine.
I wonder what we're doing differently, or whether we happened to test at different points. Here's my code:
def interpolate1(f,x,xs,ws):
    p,s = 1,0
    for (xj,wj) in zip(xs,ws):
        p *= x-xj
        s += wj*f(xj)/(x-xj)
    return p*s
(it's "interpolate1" because I also implemented the other version of the formula described in the article, which for this purpose behaves very similarly).
For your example we want f(x) = 10+10x, xs=[0,1], ws=[-1,1].
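Concretely, the sort of check I ran looks roughly like this (it reuses the interpolate1 above; the particular test points aren't special, just values near and between the nodes):

f = lambda x: 10 + 10*x
xs, ws = [0.0, 1.0], [-1.0, 1.0]

for x in (0.5, 2**-52, 1 - 2**-53):
    # last two columns agree to within an ulp or so
    print(x, interpolate1(f, x, xs, ws), f(x))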
As I mentioned before, if you implement the calculation of the polynomial called l(x) in the paper -- the thing my code calls p -- in an unfortunate way, you can get cancellation problems, but for this case I'm not sure I see any way for that to happen.
I tried calculating l(x) in a different way that could conceivably cause cancellation issues, aimed at this particular interpolant:
def foo(x):
    p = x*x-x
    s = -1*10/x + 1*20/(x-1)
    return p*s
but this also gave reasonable results for every value of x I tried other than exactly 0 and exactly 1.
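For concreteness, the spot checks were along these lines (exact hits on 0 or 1 make Python raise ZeroDivisionError rather than return inf/nan, which is why these stay just off the nodes; the particular x values aren't special):

# Spot-check foo at a few points near, but not exactly at, 0 and 1.
for x in (2**-52, 1e-9, 0.5, 1 - 1e-9, 1 - 2**-53):
    print(x, foo(x), 10 + 10*x)   # the last two columns should agree closely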
What exact values of x are giving you what bad results? What language are you implementing this in, with what data types?
If you want to implement the “first barycentric formula” instead you need to take some additional care to avoid overflow/underflow.
As the papers above note, the choice of interpolation nodes makes a big difference. If you have an arbitrary function to approximate on an arbitrary complex domain, take a look at https://arxiv.org/abs/1612.00337
There are advantages to using either version of the formula. If you use the "first" then you do need to take care about overflow/underflow when the number of interpolation points is large, but that isn't an issue here. The "first" behaves better than the "second" for extrapolation, as discussed in the paper by Trefethen et al. And yes, the choice of interpolation points is important.
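In case it helps to see it written down, here is roughly what I mean by the "second" (true barycentric) form, in the same shape as my interpolate1 above -- my own sketch, so the details may differ from the code in the article:

def interpolate2(f, x, xs, ws):
    # "Second" (true barycentric) form: a ratio of two sums, so the l(x)
    # factor cancels and never has to be computed explicitly.
    num, den = 0.0, 0.0
    for (xj, wj) in zip(xs, ws):
        t = wj / (x - xj)
        num += t * f(xj)
        den += t
    return num / den

Like interpolate1, this blows up if x coincides exactly with one of the nodes.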
But none of that should make a difference here. What's actually causing the dubious results jlokier reports isn't exactly proximity to the interpolation points -- I don't think it could happen if one of them weren't exactly zero. What's happening is that the very smallest values of x in jlokier's code are denormal numbers; that is, ones whose floating-point representation has less precision than usual.
Usually a floating-point number looks like 1.<stuff> x 2^n; for IEEE double precision, the <stuff> is 52 bits long. So if you start with 1, 1/2, 1/4, 1/8, and keep dividing by 2, you always have stuff=0 and n decreases by 1 each time. But what should you do once you reach the smallest value of n that will fit in the floating-point format? One answer is just to say that after that you get zero. But you can get a little extra dynamic range if instead you say that when n is as small as it's allowed to be, you can use 0.<stuff> instead of 1.<stuff>. These numbers have reduced precision for the sake of extra dynamic range, and they are called "denormal". Lots of things go subtly (or not so subtly) wrong when dealing with denormal numbers.
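If it helps, a couple of lines of Python (whose floats are IEEE doubles) show where the cutoff sits:

import sys

print(sys.float_info.min)       # 2.2250738585072014e-308, the smallest *normal* double
print(5e-324)                   # smallest positive denormal (only one bit of precision left)
print(sys.float_info.min / 2)   # still nonzero: a denormal, not 0.0
print(5e-324 / 2)               # rounds to 0.0: we've fallen off the bottom of the range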
One thing that goes wrong is the interpolation formula we're discussing here. The formula looks like f(x) ~= l(x) . sum of wj f(xj) / (x-xj) where xj are the interpolation points (in this case 0 and 1), l(x) is the product of (x-xj), so in this case x(x-1), and wj = 1 / product (xj-xi) where xi ranges over all the other interpolation points besides xj, so w0 = -1 and w1 = +1. So we have f(x) ~= x(x-1) . [-10/x + 20/(x-1)]. In fact this is exact provided x is neither 0 nor 1, because the thing we're approximating is a polynomial of degree <= (# interpolation points) - 1.
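(Concretely: x(x-1) . [-10/x + 20/(x-1)] = -10(x-1) + 20x = 10x + 10, which is exactly the polynomial we started with.)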
So, what I claimed before was: suppose x = xj + h where h is small, and suppose you compute l(x) as the product of its factors. Then we get l(x) = h times product (xj+h-xi), and the sum is wj f(xj) / h + the sum of the other terms; and if h is tiny then the other terms in the sum are tiny in comparison, and the factors of h and 1/h cancel and everything's fine.
But that doesn't happen when xj=0 and x (and hence h) is denormal -- because you can't pull the denormal trick at the large end of the range, so the reciprocal of a small enough denormal floating-point number overflows to floating-point infinity!
This isn't a cancellation problem. It isn't a matter of the formula going unstable when x is very close to one of the interpolation points. But it is a problem, and it never occurred to me until jlokier demonstrated it, because denormals are kinda obscure and I didn't think to think about what might happen if they get involved. It would be avoided if you made your interpolator check for values very close to one of the xj. But -- I think, but this stuff is treacherous -- it would also be avoided if you checked for infinities as well as NaNs on the output side, which would be more efficient (provided your inputs are usually sensible). Or -- I think, but this stuff is treacherous -- you could just check explicitly for denormal values of x and maybe of the xj, because if none of those is denormal this pathology can't arise.
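To make that concrete (this reuses the interpolate1 above; 5e-324 is the smallest positive double, a denormal):

import math

f = lambda x: 10 + 10*x
xs, ws = [0.0, 1.0], [-1.0, 1.0]

x = 5e-324                          # smallest positive (denormal) double
print(1.0 / x)                      # inf: the reciprocal overflows
print(interpolate1(f, x, xs, ws))   # inf, where the true value is essentially 10

# Output-side guard: accept the result only if it is finite (catches inf and nan).
y = interpolate1(f, x, xs, ws)
if not math.isfinite(y):
    print("non-finite result; handle this x specially")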
This still gives incorrect +Inf results in an interval around the first interpolation point.
The interval is small for parameters interp(x, 0, 10, 1, 20). But interp(x, 0, 1e50, 1, 2e50) shows the interval can be much larger, as it depends on the y values.
Both formulas return incorrect results in an input range larger than the denormals (1e50 times larger in the second example above). Plenty of "normal" inputs that can arise from other calculations. And these parameters are not unusual either.
So you can't ensure a correct result from either barycentric formula in floating point by just checking for denormal input.
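For example, transcribing the second formula directly (Python here rather than my C, but doubles either way) for just these two points:

# Second barycentric form for nodes 0 -> 1e50, 1 -> 2e50, written out directly.
x = 1e-280                      # a perfectly ordinary normal double, nowhere near denormal
num = -1e50/x + 2e50/(x - 1)    # first term overflows: -inf
den = -1.0/x + 1.0/(x - 1)      # stays finite (about -1e280)
print(num, den, num/den)        # -inf  -1e+280  inf, where the true value is about 1e50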
I think you may be right that checking for +/-Inf result as well as NaN may be an adequate test, but I wouldn't assume it without a proof or some paper demonstrating that for sure.
Nice catch on using enormous values of y, where indeed it isn't just denormal values of x that cause trouble. Still not a catastrophic-cancellation problem, though. Let's look more closely. Suppose x is very very close to 0. (In particular, close enough that x-1 is indistinguishable from -1. This is not a source of trouble in this case.)
The weights w0,w1 are -1 and +1, as before.
The numerator is -10^50/x - 2·10^50. The denominator is -1/x - 1. If x is small enough (which it is in the cases that give spurious infinities) the first term in each of these is so much larger that adding the second doesn't change it; so what you're really computing is (-10^50/x) / (-1/x). And if x is small enough that 10^50/x is infinite, then boom.
Something similar could happen even without numbers as large as 10^50. If the value at 0 were 3 instead, then any x whose reciprocal is bigger than 1/3 of the largest representable non-infinity would produce the same problem. Using f(0)=10^50 enlarges the range in which this particular overflow happens by a factor of 10^50, though.
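(For instance, in Python doubles:)

x = 1e-308          # denormal: below the smallest normal, which is about 2.2e-308
print(1.0 / x)      # finite, roughly 1e+308
print(3.0 / x)      # inf: overflows, giving the same spurious infinity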
The fact that this isn't an exotic catastrophic-cancellation problem doesn't, of course, detract from your original point (with which, for the avoidance of doubt, I agree): a calculation that can blow up for certain exact values is likely to misbehave for values that are merely close to those, so testing for the exact values is likely to be a very bad idea.
If I were actually writing something that does interpolation in this way, I would certainly not want to trust that checking the results for infs and nans was sufficient (without, as you say, either proving it or finding someone else who had -- but I suspect it's actually only almost true).
If x is very small (<1e-100 or something) you can alternatively evaluate a linear approximation accurate near 0. If y is going to be enormous (>1e+100 or something) you should rescale it.
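A rough sketch of what that could look like for the two-point case in this thread (the 1e-100/1e+100 cut-offs are the arbitrary ones above, and an exact hit on x = 1 would still need its own check):

def guarded_interp(x, y0, y1, small=1e-100, big=1e+100):
    # Linear interpolation between (0, y0) and (1, y1) via the second
    # barycentric form, with the two guards suggested above bolted on.
    scale = max(abs(y0), abs(y1))
    if scale > big:
        # Rescale enormous y values, interpolate, then scale back up.
        return scale * guarded_interp(x, y0 / scale, y1 / scale, small, big)
    if abs(x) < small:
        # Very small x: evaluate the (here linear) interpolant directly;
        # this form can't overflow or turn into inf/nan for tiny x.
        return y0 + (y1 - y0) * x
    num = -y0 / x + y1 / (x - 1.0)    # weights are -1 and +1 for nodes 0 and 1
    den = -1.0 / x + 1.0 / (x - 1.0)
    return num / den                  # still assumes x != 1.0 exactly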