Hacker News new | past | comments | ask | show | jobs | submit login

Hmm, do you mean N+K+1 (to have enough points for both the data and parity shards)? Why isn't N+K sufficient, fitting a polynomial to N points and emitting K more?



Using the notation from the article, N+K is sufficient for RS(N,K). One point of confusion is that different authors use different notation; some use RS(num data shards, num parity shards), some use RS(total num shards, num data shards), and some use RS(total num shards, num parity shards). Per the article, I'll use RS(num data shards, num parity shards).

As for where the +1 comes from, the clue is in the "noting that you shouldn't use the value 0 in the encoding matrix" remark. The TLDR is that the +1 isn't required, and arises from an (incorrect) attempt to fix an incorrect construction. The non-TLDR is rather long. First, we need to move from polynomials (per the article) to matrices (per the quoted remark). For this movement, let F denote the polynomial from the article, then it so happens that F(k+1) can be expressed as a linear combination of F(1), F(2), ..., F(k). Similarly, F(k+2) can be expressed as a linear combination of F(1), F(2), ..., F(k). This continues to be true up to F(k+t) (and beyond). These various linear combinations can be written as a k-by-t matrix, which is what the quoted remark means by "encoding matrix". Second, once thinking with matrices rather than with polynomials, people want to construct the encoding matrix directly, rather than deriving it from a polynomial. In this direct construction, the requirement is that every square submatrix (of any size) of the k-by-t matrix is invertible. Accordingly, no element of the k-by-t matrix can be zero, as the 1-by-1 submatrix containing just that zero element isn't invertible. Third, one common (but incorrect) direct construction is to create a k-by-t Vandermonde matrix. Such a matrix is usually constructed from some number of distinct elements, but if zero is used as such an element, then the resultant matrix will contain zeroes, which is problematic. Excluding zero causes the Vandermonde construction to _sometimes_ work, but it still doesn't _always_ work. Per https://www.corsix.org/content/reed-solomon-for-software-rai..., there's a slightly different Vandermonde construction that _does_ always work, and also a Cauchy construction that always works, both of which work for zero. Both of these correct constructions have a strong parallel to the polynomial construction: they involve choosing N+K distinct field elements, which is akin to choosing the N+K distinct x co-ordinates for the polynomial.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: