Fast Fibonacci numbers using Monoids (haskellforall.com)
144 points by lelf on April 23, 2020 | 40 comments



Exponentiation-by-squaring is usually phrased as the problem of computing X^n, but it's actually more natural to ask about X^n Y. Unlike finding X^n directly, this reduces immediately to another instance of the same problem.

  X^n Y = (X^2)^(n/2) Y if n is even
  X^n Y = X^(n-1) (XY) if n is odd
So you don't really need to take a full matrix-matrix product; you only need to square a matrix, X^2, and take a matrix-vector product, XY.

Fibonacci matrices in particular can be represented by their first column alone, since they have the form

  A  B
  B A+B
so these are just

  square (A, B) = (AA+BB, 2AB+BB)
  mulVec (A, B) (u, v) = (Au+Bv, Av+B(u+v))
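
In Haskell this whole recursion fits in a few lines. A minimal sketch (square and mulVec are the definitions above; powVec and the fib wrapper are names of my own):

  -- Represent the symmetric matrix [a b; b a+b] by its first column (a, b).
  type Mat = (Integer, Integer)
  type Vec = (Integer, Integer)

  square :: Mat -> Mat
  square (a, b) = (a*a + b*b, 2*a*b + b*b)

  mulVec :: Mat -> Vec -> Vec
  mulVec (a, b) (u, v) = (a*u + b*v, a*v + b*(u + v))

  -- X^n Y: square the matrix while n is even, absorb one factor into
  -- the vector when n is odd.
  powVec :: Mat -> Integer -> Vec -> Vec
  powVec _ 0 y = y
  powVec x n y
    | even n    = powVec (square x) (n `div` 2) y
    | otherwise = powVec x (n - 1) (mulVec x y)

  -- M = [0 1; 1 1] has first column (0, 1), and M^n (1,0)^T = (F(n-1), F(n))^T.
  fib :: Integer -> Integer
  fib n = snd (powVec (0, 1) n (1, 0))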


> You can more easily generalize this solution to other arithmetic sequences.

Fun trick: if you ever find yourself in a situation like this, instead of calculating the matrix by hand (a tedious chore), you can run an algorithm called Berlekamp-Massey on the terms of your sequence and then find the k-th term in O(N log N log K) time instead of the O(N^3 log K) that matrix exponentiation would take.
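
For example, fed the first few Fibonacci terms 0, 1, 1, 2, 3, 5, 8, 13, Berlekamp-Massey recovers the coefficients (1, 1), i.e. the recurrence s(n) = s(n-1) + s(n-2), without you ever writing down a matrix.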


Here's a nice blog post about the algorithm: https://codeforces.com/blog/entry/61306

It definitely makes sense: the recurrence relation has its own characteristic polynomial, and even naive polynomial multiplication is very fast.

Now, what I'm wondering is if this is possible to do with adjacency matrices of graphs.

Recently I was calculating the number of paths a knight can take from one position to another in K moves. I ended up with O(N^6 log K) (N is the chessboard dimension, so the adjacency matrix is N^2 x N^2), which was fast enough for a small chessboard, but maybe there's a way to turn that adjacency matrix into a polynomial and do the same.


Yes you can.

The Cayley-Hamilton theorem gives a linear relation between A^N and the lower powers of A.

So for your problem: you can calculate the number of paths from A to B of length k in O(k N^2) time, you need O(N^3) to compute enough terms for Berlekamp-Massey, O(N^2) to run Berlekamp-Massey itself, and then O(n log n log k) to compute the k-th linear recurrence term with FFT.
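
For the first step, a minimal Haskell sketch of counting paths by repeated sparse matrix-vector products (all names here are mine; using Data.Map for brevity costs an extra log factor over the O(k N^2) bound):

  import qualified Data.Map.Strict as M

  type Sq = (Int, Int)

  -- all legal knight moves from a square on an n x n board
  knightMoves :: Int -> Sq -> [Sq]
  knightMoves n (r, c) =
    [ (r', c')
    | (dr, dc) <- [(1,2),(2,1),(-1,2),(-2,1),(1,-2),(2,-1),(-1,-2),(-2,-1)]
    , let r' = r + dr
    , let c' = c + dc
    , r' >= 0, r' < n, c' >= 0, c' < n ]

  -- one sparse matrix-vector step v -> A v for the knight adjacency matrix
  step :: Int -> M.Map Sq Integer -> M.Map Sq Integer
  step n v = M.fromListWith (+)
    [ (t, x) | (s, x) <- M.toList v, t <- knightMoves n s ]

  -- number of length-k knight paths from a to b; these counts, for
  -- k = 0, 1, 2, ..., are the terms you would feed to Berlekamp-Massey
  paths :: Int -> Int -> Sq -> Sq -> Integer
  paths n k a b =
    M.findWithDefault 0 b (iterate (step n) (M.singleton a 1) !! k)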

I've actually considered writing a blog post about this :) It's a cool example of how much further you can go with a problem where many people only know the O(NK) DP solution.


Please write a blog post!


On an unbounded chessboard it's not too difficult to represent the movement of a knight by a polynomial (or really a 2D convolution), but when your chessboard is bounded, the boundary effects become annoying. Maybe there is some way to mitigate those, but I'm not aware of any.


> … in fact, you can easily compute up to f(10^8) in a couple of seconds using this code (not shown, because the result takes far longer to print than to compute).

Is this actually that fast, or did the author get fooled by lazy evaluation or something along those lines? For the fun of it I tried to come up with an algorithm in C# that works more or less like the matrix exponentiation one, but without explicitly using matrix exponentiation. When finished, it failed to terminate for large numbers, and I eventually estimated that it would take about six hours to find the 100 millionth Fibonacci number with this code on my laptop. I don't see why this should be so much slower than a straightforward implementation of the matrix exponentiation algorithm; where is the big difference, if any? Or is the .NET big integer implementation that slow? Or did the author get fooled after all?

  private static BigInteger Fibonacci(UInt32 n)
  {
    if (n == 0)
    {
      return BigInteger.Zero;
    }
    else
    {
      // Fast doubling: maintain the invariant (a, b) = (F(m), F(m+1))
      // while shifting the bits of n in from the left; at the end
      // m = n - 1, so b = F(n).
      var (a, b) = (BigInteger.Zero, BigInteger.One);

      // Mask for the highest set bit of n.
      var mask = 1u << (Int32)Math.Floor(Math.Log(n, 2));

      while ((mask >>= 1) != 0)
      {
          var s = a + b; // F(m+2)

          var (a2, b2, s2) = (a * a, b * b, s * s);

          // F(2m+1) = a^2 + b^2, F(2m+2) = s^2 - a^2, F(2m+3) = s^2 + b^2
          (a, b) = ((n & mask) == 0) ? (a2 + b2, s2 - a2) : (s2 - a2, s2 + b2);
      }

      return b;
    }
  }


I transcribed your code into Python and it took about 90s to do fib(10**8) on my T420. Language doesn't matter of course, this is just an exercise of the bignum library.

Nice idea to use s.


Okay, so the .NET big integer implementation is like two orders of magnitude slower than it could be? I will have a look into that, I always assumed that all big integer libraries are more or less comparable in performance. Thanks for the hint.

EDIT: I found an issue [1] concerning .NET big integer performance; it seems it only uses fairly basic and relatively slow algorithms, especially for multiplication, which is the relevant operation here.

[1] https://github.com/dotnet/runtime/issues/14407


To do f(10^8) the OP says his/her code would take the matrix and square it about 26 times (2^26 = 67 million and change), then do some more in other ways. That's where the speedup comes from. Are you also doing that?

(The mechanism for the speedup is described in passing, in a little note about the Haskell library function he's using to calculate a power of the matrix. For whatever it's worth this trick is excellent, but it dates from before computers were invented.)


Yes, as you can see in the code, the loop executes once for each bit to the right of the leftmost set bit, i.e. ⌊log₂(n)⌋ times. Naively this suggests a runtime of O(log n), but the numbers become huge pretty quickly, and then the arithmetic operations are far from O(1): F(n) has Θ(n) bits, so with schoolbook multiplication the final squaring alone costs Θ(n²) work. The actual runtime is probably more like O(n²) or O(n² log n) after one accounts for the runtime of the big integer operations.


The blog post author abandons the closed-form solution (φ^n - ψ^n) / (φ - ψ), citing loss of accuracy, in order to demonstrate the pretty neat matrix solution.

The closed-form solution is not quite a dead end, though. The floating-point errors can be avoided if one works in the field of numbers of the form a + b√5, where a and b are rational.

All one needs to do is maintain the numbers a and b separately. The same doubling trick applies for exponentiation.
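
A minimal Haskell sketch of one way to do this: since φ^n = (L(n) + F(n)√5)/2, where L(n) are the Lucas numbers, tracking the integer pair (L(n), F(n)) keeps the arithmetic exact, and mtimesDefault from base supplies the doubling (the Root5 name is mine):

  import Data.Semigroup (mtimesDefault)

  -- (l, f) represents the number (l + f*sqrt 5) / 2; the halved products
  -- below stay integral because l and f always have the same parity.
  data Root5 = Root5 !Integer !Integer

  instance Semigroup Root5 where
    Root5 l1 f1 <> Root5 l2 f2 =
      Root5 ((l1*l2 + 5*f1*f2) `div` 2) ((l1*f2 + f1*l2) `div` 2)

  instance Monoid Root5 where
    mempty = Root5 2 0  -- the number 1 = (2 + 0*sqrt 5) / 2

  -- phi^n = (L(n) + F(n)*sqrt 5) / 2, so read off the sqrt-5 coefficient
  fib :: Int -> Integer
  fib n = f where Root5 _ f = mtimesDefault n (Root5 1 1)  -- phi = (1+sqrt 5)/2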


This is equivalent to the matrix solution. The matrix has eigenvalues φ and ψ, so working with linear combinations of powers of the matrix is the same as working over Q(φ, ψ) = Q(√5). This is not just an abstract mathematical equivalence: computationally the solutions are identical and run within a constant factor of each other.


I believe it's equivalent to working in a basis that diagonalizes the matrix rather than working in the standard one. Much easier to work with.


Indeed (φ^n - ψ^n) / (φ - ψ) is what you'd get if you diagonalize the matrix.

However, the field you'd be working over is Q[√5], although really we're interested in the subring Z[φ] here. And since φ is (trivially) an eigenvalue of the matrix (call it M), this is equivalent to calculating in Z[M] by the Cayley-Hamilton theorem.

Of course some of these constructions might be easier to work with, and certainly something like Z[x]/(x^2-x-1) allows for fairly straightforward calculation, but you're still doing the same multiplications essentially.

And if you're working in a convenient ring you don't really need the ψ^n term, since you can just use that φ^n = F_{n-1} + F_{n} φ.


Very neat. Picking up the exponentiation by squaring algorithm for free by implementing Semigroup is a really nice touch.

Since you're defining your own custom Matrix2x2 class and <> operation anyway, you can save a little space and computation by relying on the fact that every Fibonacci matrix is of the form: [ a+b, a; a, b ]. Thus only two member variables are needed, and <> can be implemented with fewer multiplications.
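
A minimal sketch of that representation in the style of the post (Fib2 is a name I made up):

  import Data.Semigroup (mtimesDefault)

  -- M^n = [F(n+1) F(n); F(n) F(n-1)] is determined by a = F(n), b = F(n-1)
  data Fib2 = Fib2 !Integer !Integer

  -- multiplying [a+b a; a b] by [c+d c; c d] needs only the four
  -- products ac, ad, bc, bd instead of the eight of a full 2x2 multiply
  instance Semigroup Fib2 where
    Fib2 a b <> Fib2 c d = Fib2 (a*c + a*d + b*c) (a*c + b*d)

  instance Monoid Fib2 where
    mempty = Fib2 0 1  -- the identity matrix

  fib :: Int -> Integer
  fib n = a where Fib2 a _ = mtimesDefault n (Fib2 1 0)  -- M = [1 1; 1 0]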


The essence of using exponentiation by squaring to calculate Fibonacci numbers in O(log N) arithmetic operations is also covered in SICP exercise 1.19.

The elegance of the Haskell solution is that you only need to define a Monoid, and then you get exponentiation by squaring for free. This is a great example of abstraction done right.


I see monoids more and more every day. A bit off topic, but does anyone have a good introduction-level resource for learning about monoids?


There is not much to learn, a monoid is a very simple abstraction.

Think for example about learning about the "iterator" abstraction. It's just something with "hasNext" and "next" operations, nothing more to understand. But when you know it you start seeing iterators everywhere, and you get frustrated when something that should be an iterator isn't so you cannot write a simple foreach loop. And you'll start implementing iterators in your own code, because you know that provides a lot of convenience for free to consumers.

Same thing for monoids. They're everywhere, they offer some convenience for free, and you'll be wondering why things that should have monoid instances don't.


A monoid is an algebraic structure that has an associative binary operation (+, *, ++, etc.) and an identity element (0, 1, [], etc.). In terms of its relationship to other algebraic structures, it "inherits" the binary operation from its Semigroup buddy, which is a very nice structure for representing non-empty containers (and statically enforcing that property).

Julie Moronuki's blog has helped me greatly to understand it all: https://argumatronic.com/posts/2019-06-21-algebra-cheatsheet...
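
To make it concrete, here are a few standard monoids from base, all driven by the same foldMap (a small sketch, not from the linked post):

  import Data.Monoid (Product(..), Sum(..))

  -- one fold, three monoids: (+, 0), (*, 1) and (++, [])
  example :: (Integer, Integer, String)
  example =
    ( getSum     (foldMap Sum     [1, 2, 3])  -- 6
    , getProduct (foldMap Product [1, 2, 3])  -- 6
    , mconcat    ["mono", "id"]               -- "monoid"
    )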


It depends on what you mean by introduction. I think the Typeclassopedia entry on Semigroups and Monoids is pretty good, if you are familiar with basic Haskell syntax.

https://wiki.haskell.org/Typeclassopedia#Semigroup


Thanks for the link.

I am not good with Haskell, but I meant a general mathematical/algebraic introduction, explanation, or lecture.


In abstract algebra there really is a hierarchy of group-like structures with progressively more axioms (see the table in [1]), and a monoid is one of those structures.

I've had some luck with the first chapter of this (free) book [2]; it explains groups rather than monoids, but from there it was rather quick for me to gain intuition about monoids.

[1] https://en.wikipedia.org/wiki/Monoid#Relation_to_category_th...

[2] https://web.evanchen.cc/napkin.html



The Napkin author also has lectures on YouTube, which are great.


Side note: the study of monoids isn't a very big field in mathematics, because they're too simple for there to be much to say about them. But if you modify them just a bit, either in the direction of groups or in the direction of categories, you get massive fields of study.


I'm reminded of John Baez writing recently (https://twitter.com/johncarlosbaez/status/124647011232195379...) about commutative semigroups. Even simpler, but still lots to say!


I probably still stand corrected, but commutativity is a much stronger condition than the existence of an identity!


I think the existence of an identity isn't very important. You can adjoin an identity to any semigroup to turn it into a monoid.

In any case, doesn't adding more properties make a theory simpler? For example the classification of finite simple groups is much harder than the classification of finite simple commutative groups.
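
That construction is small enough to write out; a sketch (WithUnit is my name for it, but base's "Semigroup a => Monoid (Maybe a)" instance is exactly this):

  -- Adjoin a fresh identity element to an arbitrary semigroup.
  data WithUnit a = Unit | Plain a

  instance Semigroup a => Semigroup (WithUnit a) where
    Unit    <> y       = y
    x       <> Unit    = x
    Plain x <> Plain y = Plain (x <> y)

  -- Unit is a two-sided identity by construction, so this is lawful.
  instance Semigroup a => Monoid (WithUnit a) where
    mempty = Unit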


Maybe "A first course in Abstract Algebra" by Fraleigh available on Kindle for less than $10. https://www.amazon.com/dp/0201763907

(It's not focused on Monoids. It also covers groups, rings, and fields from a mathematical point of view. It's written for an undergraduate math major with no prior experience in these areas.)


Thank you everyone for all the links. This is very helpful!


No introduction necessary, monads are just monoids in the category of endofunctors.


Wrong way around. They want to know what monoids are. So you have to tell them that monoids are just modules of the list monad. ;-)


Or that a monoid is merely a category with exactly one object.


> You can more easily generalize this solution to other arithmetic sequences.

The closed form as a sum of exponentials also generalizes easily. You can find the roots of an associated polynomial, and those roots are the numbers you take powers of. This is explained on Wikipedia here: https://en.wikipedia.org/wiki/Recurrence_relation#Solving_ho....
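
For example, for F(n) = F(n-1) + F(n-2) the associated polynomial is x^2 - x - 1, whose roots are φ and ψ, so the general solution is c1 φ^n + c2 ψ^n; the initial values F(0) = 0 and F(1) = 1 force c1 = -c2 = 1/√5, which recovers the (φ^n - ψ^n) / (φ - ψ) closed form quoted upthread.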


As an aside: a lovely (freely available) book on using generating functions to derive closed-form solutions to such problems, by the late Herbert Wilf: https://www.math.upenn.edu/~wilf/DownldGF.html

One of the more enjoyable maths texts I've read.


Nothing to add to this except to say that Generatingfunctionology is an awesome book. Although I haven't read it, another in the same space is A=B, also co-authored by Wilf (https://www.math.upenn.edu/~wilf/AeqB.html) (another co-author is Zeilberger, whose name anyone interested in computational discrete math will recognise).


I've spent some time on A = B as well. Very pleased to find someone else interested in these things on here!


Several years ago I hacked together a little tool for exploring integer sequences that can be generated via matrix-vector products:

http://dpw.me/clear-the-deck/

Sadly it's not as functional as it once was (sequences are supposed to continue generating as you scroll down the page). If anyone's interested in lending a hand to get it working properly again, I'd love to hear from you! My email is in my profile.


I wrote a blog post about it here at the time:

https://aperiodical.com/2014/06/discovering-integer-sequence...



