Hacker News

Strassen is reasonably numerically stable (although not as good as the naive algorithm), but everything beyond Strassen is extremely unstable. This is due to a trick that many of these algorithms employ: in theory, you only need to multiply 0-1 matrices to fully solve the matrix multiplication problem, so an algorithm which computes the entries of the product with additive error 0.1 is enough to solve it exactly (just round each entry to the nearest integer). As you can imagine, this means such an algorithm only guarantees O(1) bits of precision, which is only useful if you employ this reduction to the 0-1 case first.
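A minimal sketch of the rounding trick in numpy (names and the noise model are illustrative, not from any real fast-matmul implementation): for 0-1 matrices the exact product has integer entries, so any algorithm with additive error below 0.5 per entry can be corrected by rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=(4, 4)).astype(float)  # 0-1 matrix
B = rng.integers(0, 2, size=(4, 4)).astype(float)  # 0-1 matrix

exact = A @ B  # integer-valued, since A and B are 0-1

# Simulate an "approximate" algorithm: each entry off by at most 0.1.
noisy = exact + rng.uniform(-0.1, 0.1, size=exact.shape)

# Rounding to the nearest integer recovers the exact product.
recovered = np.round(noisy)
assert np.array_equal(recovered, exact)
```

The point is that additive error 0.1 (only O(1) bits of precision) already suffices here, whereas for general real matrices it would be useless.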

To understand why this even happens, let's say we want to compute the expression A*B + B*A for matrices A and B. One way is to compute the products A*B and B*A naively: two multiplications are needed. A trickier way is to introduce a parameter x: we instead compute A*A and (x*A + B/x)*(x*A + B/x) = x^2 A*A + (A*B + B*A) + x^-2 B*B. Thus (x*A + B/x)*(x*A + B/x) - x^2 A*A = (A*B + B*A) + x^-2 B*B: if x is large enough, the x^-2 B*B term becomes negligible and we can employ the rounding trick from before. Now this still needed two multiplications, but here one of them was A*A. If later in the algorithm we needed to compute A*C + C*A, we could do that with only 1 additional matrix multiplication by repeating the trick and reusing A*A. A more sophisticated version of this idea underlies all known approaches that get the exponent ω below 2.8.
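The identity above can be checked numerically; here's a hypothetical sketch in numpy (the choice x = 1e4 is mine, picked so the x^-2 B*B term stays well below the 0.5 rounding threshold for small 0-1 matrices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.integers(0, 2, size=(n, n)).astype(float)  # 0-1 matrix
B = rng.integers(0, 2, size=(n, n)).astype(float)  # 0-1 matrix

x = 1e4          # large parameter: the x^-2 B*B error term shrinks as x grows
AA = A @ A       # multiplication 1: reusable for later products like A*C + C*A
M = x * A + B / x
# multiplication 2: M @ M = x^2 A*A + (A*B + B*A) + x^-2 B*B
approx = M @ M - x**2 * AA   # = (A*B + B*A) + x^-2 B*B

# Entries of A*B + B*A are integers, so rounding removes the x^-2 B*B error.
result = np.round(approx)
assert np.array_equal(result, A @ B + B @ A)
```

This also shows where the instability comes from: M @ M contains entries of size ~x^2, and subtracting x^2 A*A cancels them, so in fixed-precision arithmetic you lose roughly log2(x^2) bits, which is exactly why only the 0-1 (rounding) setting survives.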




Thank you for the very nice explanation!



