Polynomial interpolation (cohost.org)
97 points by atan2 on Feb 8, 2023 | 45 comments



Related: the Fast Fourier Transform (FFT) can be used for polynomial interpolation in O(n log n) time (if you get to choose the points, and use the "complex roots of unity"), where n is the degree of the polynomial. It is also used as a step in O(n log n) polynomial multiplication.

This video covers it well: https://youtu.be/h7apO7q16V0

Also, chapter 30 of CLRS: https://elec3004.uqcloud.net/ebooks/CLRS%20(2nd%20Ed)%20-%20...
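
Here's a minimal numpy sketch of that claim (my own illustration, not from the video or CLRS): when the sample points are the n-th complex roots of unity, the interpolation (Vandermonde) system is exactly a DFT, so one inverse FFT recovers the coefficients in O(n log n).

    import numpy as np

    # Coefficients, lowest power first (chosen arbitrarily for the demo).
    coeffs_true = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
    n = len(coeffs_true)

    # numpy's forward FFT evaluates the polynomial at w_k = exp(-2*pi*i*k/n),
    # i.e. at the n-th roots of unity.
    w = np.exp(-2j * np.pi * np.arange(n) / n)

    # "Samples": values of the polynomial at the roots of unity.
    # np.polyval wants highest power first, hence the [::-1].
    values = np.array([np.polyval(coeffs_true[::-1], wk) for wk in w])
    assert np.allclose(values, np.fft.fft(coeffs_true))  # same thing

    # Interpolation step: one inverse FFT turns point values back into coefficients.
    print(np.round(np.fft.ifft(values).real, 6))  # matches coeffs_true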


What applications require polynomial interpolation with such high order that an FFT is worth it? Why not use sinc interpolation if you're computing an FFT anyway?

sidenote: polynomial multiplication != interpolation


Zero-knowledge proof generation, for one.

Edit, see for example: https://www.zprize.io/prizes/accelerating-ntt-operations-on-...


That’s pretty neat. I didn’t even think of finite fields when I made my other comment. Another example where mathematical curiosities prove useful in cryptography.


There's an aside in the linked section of CLRS that mentions that FFT-based polynomial multiplication can be used for fast evaluation of Lagrange interpolation. I'm not aware of applications, though; Lagrange interpolation isn't numerically stable, and there are better interpolation methods for large datasets. It's more of a mathematical curiosity.


Not sure why you would use FFTs to evaluate Lagrange interpolators, when Lagrange interpolators have a very pretty O(N) implementation.


The FFT is not used for that. To multiply two polynomials, C(x) = A(x)B(x), the FFT is first used to evaluate A(x) and B(x) at the complex roots of unity in O(n log n) time, which yields their point-value (Lagrange) representations. Those values are then multiplied pointwise in O(n) time to compute the y values for y = A(x)B(x), and finally the (inverse) FFT is used again to recover the coefficients of the final C(x). This is discussed on pages 825-827 of CLRS, and the diagram on page 828 shows it nicely (the pointwise multiplication refers to multiplying the evaluated values).
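
To make the pipeline concrete, here's a rough numpy sketch of the evaluate / pointwise-multiply / interpolate scheme described above (padding to a power of two is one common convention, not something CLRS mandates):

    import numpy as np

    def poly_mul_fft(a, b):
        # Multiply polynomials given as coefficient lists, lowest power first.
        n = len(a) + len(b) - 1           # number of coefficients in the product
        size = 1 << (n - 1).bit_length()  # round up to a power of two
        fa = np.fft.fft(a, size)          # evaluate A at roots of unity: O(n log n)
        fb = np.fft.fft(b, size)          # evaluate B at roots of unity: O(n log n)
        fc = fa * fb                      # pointwise multiplication: O(n)
        return np.fft.ifft(fc).real[:n]   # interpolate back to coefficients

    # (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
    print(np.round(poly_mul_fft([1, 2], [3, 4]), 6))  # [ 3. 10.  8.]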


There are n terms, each a product of n factors, so O(n^2).


Example please?


:-( Walking it back a bit. For common cases, O(N), but O(N^2) for initialization.

https://news.ycombinator.com/edit?id=34720462


FFT is almost never not worth it. Even at small scales, you should be using the FFT (and you are a daily user in audio/video/image contexts).


We have very different experiences then. "Small" scale to me is n < 64, which is right on the threshold of whether it's faster to run an n^2 algorithm that could be n log n with an FFT (for 1-dimensional vectors) - depending on the algorithm, implementation, and hardware of course.

I've benchmarked FFT/non-FFT based approaches for almost everything in that scale for the last ten years and gotten quite different results.


An interesting case is two-dimensional interpolation, for example for precise image resampling. In that case, due to separability, the FFT is still O(n log n), thus just as fast and much more precise than space-domain computations with 2D splines. Try rotating an image by 10 degrees 36 times in a row with splines and with FFT-based methods. Even 7-tap splines (which is a huge order) will utterly destroy the image contents; but it is just peanuts to the FFT.
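
Not the rotation experiment itself, but a minimal 1D sketch (my own, with made-up signal parameters) of why frequency-domain resampling is so precise: for a bandlimited signal, zero-padding the centered spectrum is exact sinc interpolation, down to rounding error.

    import numpy as np

    n, m = 31, 93   # upsample 3x; odd lengths avoid special-casing the Nyquist bin
    t = np.arange(n) / n
    x = np.sin(2 * np.pi * 3 * t) + 0.5 * np.cos(2 * np.pi * 5 * t)

    # Zero-pad the centered spectrum, then inverse FFT.
    X = np.fft.fftshift(np.fft.fft(x))
    Xp = np.zeros(m, dtype=complex)
    Xp[(m - n) // 2 : (m + n) // 2] = X
    y = np.fft.ifft(np.fft.ifftshift(Xp)).real * (m / n)  # rescale amplitude

    # The resampled signal matches the underlying function to rounding error.
    t_fine = np.arange(m) / m
    x_true = np.sin(2 * np.pi * 3 * t_fine) + 0.5 * np.cos(2 * np.pi * 5 * t_fine)
    print(np.max(np.abs(y - x_true)))  # tiny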


"CLRS" apparently refers to

https://en.wikipedia.org/wiki/Introduction_to_Algorithms

And CLR (in a sibling comment) refers to the 1990 first edition.


Just as a side note, this is chapter 32 in CLR, for those of you not with the times.


The most efficient implementation of Lagrange Interpolators is O(N).

Calculate the left Product(0..i-1) of each term in left-to-right order, and the Product(i+1..N-1) of each term in right-to-left order. Your mileage may vary on a GPU.
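
If I'm reading this right, the trick is the classic prefix/suffix product. A minimal Python sketch of my understanding (the O(N^2) weight precomputation is discussed further down the thread):

    import numpy as np

    def lagrange_weights(xs):
        # One-time O(N^2) step: w_i = prod_{j != i} (x_i - x_j)
        xs = np.asarray(xs, dtype=float)
        return np.array([np.prod(xs[i] - np.delete(xs, i)) for i in range(len(xs))])

    def lagrange_eval(x, xs, ys, w):
        # O(N) evaluation: prefix and suffix products of (x - x_j)
        n = len(xs)
        left = np.ones(n)                # left[i]  = prod_{j < i} (x - x_j)
        for i in range(1, n):
            left[i] = left[i - 1] * (x - xs[i - 1])
        right = np.ones(n)               # right[i] = prod_{j > i} (x - x_j)
        for i in range(n - 2, -1, -1):
            right[i] = right[i + 1] * (x - xs[i + 1])
        return float(np.sum(ys * left * right / w))

    xs = np.array([0.0, 1.0, 2.0, 3.0])
    w = lagrange_weights(xs)
    print(lagrange_eval(1.5, xs, xs**2, w))  # 2.25, since y = x^2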


I think you're talking about evaluating the interpolating polynomial at a single value, but others are talking about getting all of the coefficients of the interpolating polynomial.


GP is referring to https://en.wikipedia.org/wiki/Lagrange_polynomial

Build a table left[j] of product-of-ratios for i < j, and another right[k] for i > k, each in linear time, and combine the two to find the product-of-ratios for i != m in linear time as well.


Right, and if we are trying to get all coefficients and are thus performing the polynomial multiplications symbolically, then performing the multiplications naively will take O(n^2) time but using FFT allows it to be done in O(n log n) time.


:-( Walking it back a bit. For common cases, O(N), but O(N^2) for initialization.

https://news.ycombinator.com/edit?id=34720462


I'm not aware of any algorithm for Lagrange interpolation that is O(N), and wikipedia does not know one either.

What do you mean by left product and right product?


Here's some Clojure code I wrote for the Emmy computer algebra system that implements polynomial interpolation with a few different algorithms described in Numerical Recipes:

https://github.com/mentat-collective/emmy/blob/main/src/emmy...

I discovered while writing this that I could express each of these algorithms as folds that consumed a stream of points, accumulating a progressively higher order polynomial (there's a rough sketch of the idea just below).

Here's the same sort of thing but for rational function interpolation: https://github.com/mentat-collective/emmy/blob/main/src/emmy...
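
In Python pseudocode, the fold idea looks roughly like this (my own sketch of Neville's algorithm as a fold, not code from Emmy; `neville_fold` is a made-up name):

    def neville_fold(points, x):
        # Consume points one at a time, keeping the last row of the Neville
        # tableau; after each point, row[-1] is the interpolated value at x
        # using all points seen so far.
        xs, row = [], []
        for xi, yi in points:
            prev, row = row, [yi]
            for k in range(len(prev)):
                xj = xs[len(xs) - 1 - k]   # node paired with this tableau entry
                num = (x - xj) * row[k] - (x - xi) * prev[k]
                row.append(num / (xi - xj))
            xs.append(xi)
        return row[-1]

    print(neville_fold([(0, 0), (1, 1), (2, 4)], 3))  # 9.0, since y = x^2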


I want to commend you for the detail and quality in the comments in your code.

These days that's a luxury. Open source projects tend to take the "and then a miracle occurs" approach to comments.

I'm old school, my comments look like yours; lots of detail, background and explanations, so I can easily get back into the code five or ten years later.


Thank you, that's really nice to read.

I really leaned in to these essay-style namespaces for this project; I spent _so_ much time deciphering old numerical algorithms ported from uncommented FORTRAN, and decided that I had to go the other direction.

Someday I'll get around to publishing a static site that renders the math in the comments, like these:

- https://samritchie.io/functional-numerical-methods (numerical integration written in stateless functional style)

- https://samritchie.io/dual-numbers-and-automatic-differentia... (automatic differentiation that works on higher order functions etc)

Send me some of your (public) code to read!


> Send me some of your (public) code to read!

That's a bit difficult, as most of my work is proprietary, under NDA or aerospace.

Here's an excerpt from a C control panel menu processor, just to show how far I go. One line of active code vs. 21 lines of comments. This was written 16 years ago. I can get back into this codebase almost instantly because of the documentation. More importantly, code reuse is also very easy for the same reason.

    case BUTTON_ENTER:
        // ENTER button clicked
        // 
        // This has the effect of calling the "DURING" handler for the currently selected
        // menu.
        // 
        // ***NOTE***
        // It is the responsibility of the ON_ENTRY handler to decide whether or not
        // to capture the message loop by asserting "state_handler_wants_priority"
        // 
        // Once done with the message loop "state_handler_wants_priority" should be cleared,
        // typically in the ON_EXIT routine
        // 
        // It is up to the handler to do something (or not) with messages.
        // 
        // Parent menus do not generally do much because they are navigation
        // stopping points and their messages are fully handled by the menu_processor() parser.
        //
        // Child menus that are not parents and, therefore, perform a function, 
        // might want to do some work before the menu_processor() parser by capturing messages
        // before they are delivered to the menu_processor() parser.
        //
        dispatch_handler(current_state, DURING);
        break;

I also like to do this sort of thing, in this case, Verilog code from a custom FPGA-based SDRAM controller:

    // COMMAND TABLE ----------------------------------------------------------------------------------
    // Use the following template for assignment of SDRAM commands from this table: 
    //
    // {SDRAM_CSn, SDRAM_RASn, SDRAM_CASn, SDRAM_WRn} <= COMMAND_NOP; // Issue command 
    //
    //                            CSn   RASn  CASn  WRn
    //                             |     |     |     |
    parameter COMMAND_NOP    = {1'b0, 1'b1, 1'b1, 1'b1};
    parameter COMMAND_ACT    = {1'b0, 1'b0, 1'b1, 1'b1};
    parameter COMMAND_RD     = {1'b0, 1'b1, 1'b0, 1'b1};
    parameter COMMAND_WR     = {1'b0, 1'b1, 1'b0, 1'b0};
    parameter COMMAND_BST    = {1'b0, 1'b1, 1'b1, 1'b0};
    parameter COMMAND_PRE    = {1'b0, 1'b0, 1'b1, 1'b0};
    parameter COMMAND_REF    = {1'b0, 1'b0, 1'b0, 1'b1};
    parameter COMMAND_LMR    = {1'b0, 1'b0, 1'b0, 1'b0};
    parameter COMMAND_IOEN   = {1'b0, 1'b0, 1'b0, 1'b0};
    parameter COMMAND_IODIS  = {1'b0, 1'b0, 1'b0, 1'b0};
    
    parameter MODE_REGISTER_DEFAULT = 14'b000000_0_011_0_111;    // CAS length = 3
    //                                      |       |  |  |
    //                                      |       |  |  Burst length: 000 = 1; 001 = 2; 010 = 4; 011 = 8; 111 = Full page
    //                                      |       |  |
    //                                      |       |  Burst type: 0 = Sequential; 1 = Interleaved
    //                                      |       |
    //                                      |       CAS Latency: 010 = 2; 011 = 3;
    //                                      |
    //                                      Write mode: 000000 = Burst read and burst write.
    //                                                  000010 = Burst read and single write.

Yes, it's a lot of work, and it is, without a doubt, worth the effort.


Check out https://www.chebfun.org/ if you want to see some crazy stuff you can do with polynomials!

Also the book Approximation Theory and Approximation Practice

https://epubs.siam.org/doi/book/10.1137/1.9781611975949

is really great, by the author of Chebfun.


Approximate approximation theory practice.


See Lagrange interpolators for the general form of the interpolator used in this article. https://en.wikipedia.org/wiki/Lagrange_polynomial

There's a neater general formula for generating Lagrange interpolator polynomials of any order that avoids having to perform matrix inversion.

Interpolated values can be calculated in O(N) time, so Lagrange interpolation is always faster than FFT interpolation. Calculate the left Product(0..i-1) of each term in left-to-right order, and the Product(i+1..N-1) of each term in right-to-left order; if you carry the value from term to term, you only need one multiplication per term.

Lagrange interpolators with degrees close to π·N tend to be smoother, with less aliasing, when processing audio signals (degree ≈ 3, 6, 9, 13, &c). Truncating the Lagrange polynomial is akin to windowing in FFT. Having an order that's roughly π·N ensures that the analogous window of the Lagrange polynomial has smaller tail values.


Walking that back a bit. Given a set of points (x[n],y[n]), the interpolated value can be calculated in O(N^2) time if the x coordinates change, and in O(N) time if only the y values change. For common cases, this means O(N^2) initialization, and O(N) evaluation, even if the Y values change.

Source: https://people.maths.ox.ac.uk/trefethen/barycentric.pdf

This paper is (admittedly) a significant improvement on the method I've been using to date. Probably the most significant improvement is a method for calculating Lagrange polynomials that provides stability guarantees. Having read it, I'm going to point you to that paper instead of describing how I did it.

The application that I'm familiar with, is interpolation of audio signals, and fractional delay lines. In these cases, x[N] can be considered to have the fixed values [0,1,2,3,...], and then subsequent interpolations can be calculated using an offset into the buffer of audio samples. Initialization takes O(N^2). Evaluation takes O(N).
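
For anyone who wants to try it, here's a small numpy sketch of the barycentric formula from the Trefethen paper linked above; the weights are the O(N^2) initialization, after which each evaluation is O(N), and the y values can change freely:

    import numpy as np

    def barycentric_weights(xs):
        # O(N^2) initialization: w_i = 1 / prod_{j != i} (x_i - x_j)
        xs = np.asarray(xs, dtype=float)
        return np.array([1.0 / np.prod(xs[i] - np.delete(xs, i))
                         for i in range(len(xs))])

    def barycentric_eval(x, xs, ys, w):
        # O(N) per evaluation
        d = x - xs
        hit = np.nonzero(d == 0.0)[0]
        if hit.size:                     # x lands exactly on a node
            return ys[hit[0]]
        tmp = w / d
        return np.sum(tmp * ys) / np.sum(tmp)

    # Fractional delay flavour: fixed integer nodes, sliding fractional offset.
    xs = np.arange(6.0)
    w = barycentric_weights(xs)          # one-time O(N^2) setup
    ys = np.sin(xs)                      # the y values may change per sample
    print(barycentric_eval(2.5, xs, ys, w))  # close to sin(2.5)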


I made a joke posting to the comp.lang.c Usenet newsgroup on this topic. Looks like it was in 2014.

https://groups.google.com/g/comp.lang.c/c/s7yeYGuMs8s/m/juwf...

Google Groups no longer allows "View Original". Let's reformat the code. Actually, nope; I found the file from 2014 easily.

Seventh degree polynomial for Fibonacci, with Easter egg:

  #include <math.h>
  #include <stdio.h>

  int fib(int n)
  {
    double out = 0.002579370;
    out *= n; out -= 0.065277776;
    out *= n; out += 0.659722220;
    out *= n; out -= 3.381944442;
    out *= n; out += 9.230555548;
    out *= n; out -= 12.552777774;
    out *= n; out += 7.107142857;
    out *= n;
    return floor(out + 0.5);
  }

  int main(void)
  {
    int i;
    for (i = 0; i < 9; i++) {
      printf("fib(%d) = %d\n", i, fib(i));
      switch (i) {
      case 7:
        puts("^ So, is this calculation fib or not?");
        break;
      case 8:
        puts("^ Answer to life, universe & everything responds negatively!");
      }
    }
    return 0;
  }
Execution:

  $ ./fib2
  fib(0) = 0
  fib(1) = 1
  fib(2) = 1
  fib(3) = 2
  fib(4) = 3
  fib(5) = 5
  fib(6) = 8
  fib(7) = 13
  ^ So, is this calculation fib or not?
  fib(8) = 42
  ^ Answer to life, universe & everything responds negatively!


It's just

    ax1^3 + bx1^2 + cx1 + d = y1
    ax2^3 + bx2^2 + cx2 + d = y2
    ax3^3 + bx3^2 + cx3 + d = y3
    ax4^3 + bx4^2 + cx4 + d = y4
As a matrix:

    [x1^3 + x1^2 + x1 + 1] [a] = [y1]
    [x2^3 + x2^2 + x2 + 1] [b] = [y2]
    [x3^3 + x3^2 + x3 + 1] [c] = [y3]
    [x4^3 + x4^2 + x4 + 1] [d] = [y4]
If the matrix is M, we just want to find M^(-1)y.

I think hackers should know enough linear algebra to do that, without "having to go asking the maths gods".


Edit: Should of course be without plusses in the matrix form:

    [x1^3, x1^2, x1, 1] [a] = [y1]
    [x2^3, x2^2, x2, 1] [b] = [y2]
    [x3^3, x3^2, x3, 1] [c] = [y3]
    [x4^3, x4^2, x4, 1] [d] = [y4]


If there is a way of doing it without inversion, it's probably more accurate - some people go out of their way to formulate methods without inversion to reduce inaccuracies.


For sure. I just wanted to point out the approach that arguably required "the least math knowledge".

Also, OP was solving a degree four problem and wanted a closed form.

For better stability just use `x = np.linalg.solve(M, y)`, or whichever linear solver you expect to be fast.
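
A tiny end-to-end version of that, with made-up control points (the four-point cubic case from upthread):

    import numpy as np

    # Hypothetical control points; any four with distinct x values work.
    xs = np.array([0.0, 1.0, 2.0, 3.0])
    ys = np.array([1.0, 3.0, 2.0, 5.0])

    # Vandermonde matrix with rows [x^3, x^2, x, 1], as in the comment above.
    M = np.vander(xs, 4)

    coeffs = np.linalg.solve(M, ys)   # [a, b, c, d], without forming M^(-1)
    print(np.polyval(coeffs, 1.5))    # interpolated value at x = 1.5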


See also (M^T M)^-1 M^T


Sure, if you have more equations than the degree, but that doesn't seem to be the case in the article.


I will probably get downvoted here, but from my perspective the author's bitterness towards mathematics ends up doing the opposite of his supposed objective of "demystifying" the subject. It pushes "hackers" and "maths" even further apart.


The coefficients of an interpolating polynomial are poorly conditioned. Instead, you should evaluate in barycentric form directly, or use alternate forms of the polynomial.


Great article, and I've been applying this recently in a few unrelated projects. Of note, researching this on the internet produces some surprising red herrings that make it seem more complicated than it is: splines? Lagrange polynomials? Newton polynomials? etc. It gets especially messy when looking for interpolation in 2D and 3D. This article's approach, by contrast, is nice and apt.

I came to the same conclusion as the article: The answer is to solve a system of linear equations with 4 unknowns.


Interesting take on y = a + x(b + x(c + x·d)), which is the Horner (nested) form of any cubic function. You're right, you don't have to be the god of math to do this.


How do you prevent overfitting?


You don't! These techniques are for getting a perfect fit.


The parent asked about overfitting because fitting a polynomial to a finite set of given data points can be viewed as an instance of a machine learning problem (inducing a regressor or classifier). What is called a "control point" in the OP would be called "training data". Now, whether polynomials are a good idea for any given problem is a matter of the nature of the data, but overfitting in this context is the question of whether the polynomial induced via the parameters A, B, C and D generalizes to further (unseen) data points or not.

In the gaming context this is not a relevant question because, as you rightly say, the problem is "just to fit" what we have.

But in scenarios where the polynomial is an intensional representation of a function specified extensionally, i.e. as a finite set of data points, and where we gradually become aware of additional data points, we can use the polynomial to forecast further data points - the interpolated ones.

Recent example paper: https://arxiv.org/pdf/1808.03216.pdf


The polynomial will go through your samples exactly, but you might have arbitrarily high error between the samples (Runge's phenomenon). If your data is noisy at all, it is much better to fit a low-order polynomial that minimises the error. If your curve has no noise and you are free to choose the sample locations, minimax is the best way to guarantee minimum error between samples.
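
A quick numpy illustration of both points, with arbitrary parameters (Chebyshev nodes stand in as a convenient near-minimax choice of sample locations; np.polyfit may warn about conditioning at this degree, which is itself part of the story):

    import numpy as np

    # Runge's function: smooth, but hostile to equispaced interpolation.
    f = lambda x: 1.0 / (1.0 + 25.0 * x ** 2)

    n = 15
    x_eq = np.linspace(-1, 1, n)                             # equispaced nodes
    x_ch = np.cos((2 * np.arange(n) + 1) * np.pi / (2 * n))  # Chebyshev nodes

    xt = np.linspace(-1, 1, 1001)
    err_eq = np.abs(np.polyval(np.polyfit(x_eq, f(x_eq), n - 1), xt) - f(xt)).max()
    err_ch = np.abs(np.polyval(np.polyfit(x_ch, f(x_ch), n - 1), xt) - f(xt)).max()
    print(err_eq, err_ch)  # equispaced error is orders of magnitude larger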


Indeed. Computing the exact polynomial going through all noisy points also doesn’t make sense because adding or removing a data point wildly changes the polynomial. So, you can’t claim any of your polynomials really describes your problem.

Even if your data isn't noisy and the point added lies exactly on the previous polynomial (making the problem degenerate), chances are float rounding and the numerical stability of your code can have that effect (https://arachnoid.com/polysolve/#x_limits)



