Introduction to Reed-Solomon and Berlekamp-Welch (vivint.com)
110 points by jtolds on Aug 2, 2017 | 27 comments



Fun to read and well written, but I am not a big fan of the "oversampling" based explanation: it leads you to think about functions on real values, while it is really finite field maths that underpins these codes.

Also, Reed-Solomon is not a particularly great forward error correcting code. RS codes are optimal as "erasure codes" (i.e. you can lose some data, but the remaining data cannot have changed), but as error correcting codes, LDPC codes and Turbo codes outperform them by a lot: e.g. satellite communication (DVB-S2) and hard drive error correction have both switched to LDPCs.


I thought Reed-Solomon also worked with a mix of erasures and errors? I wasn't aware of a requirement that erasures can't be corrected in the presence of errors unless you have hit the erasure limit for your chosen code.


For an n+k Reed-Solomon code, you can recover from k erasures, but only from k/2 errors.
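
(To make the erasure case concrete: here's a minimal Python sketch of an RS-style code over the toy prime field GF(257) rather than the GF(2^8) real implementations use; names and numbers are made up for illustration. Any n of the n+k evaluations determine the degree-(n-1) polynomial, so up to k erasures are recoverable.)

    # Toy RS-style erasure recovery over the prime field GF(257).
    # Real codes use GF(2^8); a prime field keeps the sketch short.
    P = 257

    def lagrange_eval(points, x):
        # Evaluate the unique polynomial through `points` at x (mod P).
        total = 0
        for i, (xi, yi) in enumerate(points):
            num = den = 1
            for j, (xj, _) in enumerate(points):
                if i != j:
                    num = num * (x - xj) % P
                    den = den * (xi - xj) % P
            total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse
        return total

    msg = [104, 105, 33, 7]            # n = 4 message symbols
    n, k = len(msg), 2                 # k = 2 extra (parity) symbols
    # Systematic encoding: the codeword is the degree-(n-1) polynomial
    # through (0, msg[0]) .. (n-1, msg[n-1]), evaluated at n + k points.
    base = list(enumerate(msg))
    codeword = [(x, lagrange_eval(base, x)) for x in range(n + k)]

    # Erase any k = 2 symbols; any n survivors still determine the polynomial.
    survivors = codeword[:2] + codeword[4:]
    assert [lagrange_eval(survivors, x) for x in range(n)] == msg

Errors are harder precisely because you don't know which positions to ignore, which is where Berlekamp-Welch comes in.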


This is inherent. A code can only unambiguously correct errors at half the distance of the code.

Visualize a lattice of possible input words: https://people.xiph.org/~greg/temp/lattice.png Let's say that we are using a code where the valid codewords (dark dots) are at least 4 moves apart. If you use this code for error correction, you'll take a received word and move to the nearest codeword, but this is only unambiguous if you correct errors of up to half the distance between any two codewords.
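
(A toy version of that picture, sketched in Python with a length-4 repetition code, i.e. codewords 0000 and 1111 at distance 4: one bit flip decodes unambiguously, but two flips land exactly halfway between the codewords.)

    # Nearest-codeword decoding with a distance-4 repetition code.
    codewords = ["0000", "1111"]

    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    def nearest(word):
        # Return every codeword at minimum distance from `word`.
        d = min(hamming(word, c) for c in codewords)
        return [c for c in codewords if hamming(word, c) == d]

    print(nearest("0100"))  # ['0000']          -> 1 error: corrected
    print(nearest("0110"))  # ['0000', '1111']  -> 2 errors: ambiguous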

RS codes are optimal in that they achieve the best possible distance for their size (they are MDS codes).

They may not be what you want to use in some applications because of the computational complexity of decoding with errors, or because the application can make use of list decoding, which can be more efficiently achieved with other codes, or because they need different sizes. In particular, the symbol size constraint with RS is burdensome for many applications.

The symbol size issue is a particular driver for other choices. Say your HDD encodes data with 4-bit symbols, but to get good performance you want a code that is a whole sector wide. If you use a 10-bit RS code to encode blocks of 10240 bits, then a single 4-bit symbol error can straddle two 10-bit RS symbols and so potentially corrupts 20 bits of input.

Codes with large input symbols also don't lend themselves well to soft-input decoding (because instead of propagating around a single 1-or-0 probability, you need to propagate around a probability distribution over the whole field). So they're generally much less attractive for noisy-channel coding, since modems can easily give soft outputs.


Right, so could you recover from k/2 erasures with k/4 errors?


You can unambiguously recover from floor((k-e)/2) errors in the presence of e erasures, so yes: with e = k/2 erasures you can still fix floor(k/4) errors.


Is there something akin to parchive that uses LDPCs?


Nice. There are quite a few implementations of R-S floating around out there.

I'll plug mine, too, written in C. https://github.com/quiet/libcorrect

Learning enough about finite fields to implement one is really mind-bending. I definitely recommend people try it, or at least write the encoder side.
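
(For anyone who wants a taste before committing: the core operation is multiplication in GF(2^8). A minimal Python sketch, assuming the common 0x11d reduction polynomial; this is illustrative, not how libcorrect organizes it.)

    # Multiplication in GF(2^8), reduction polynomial x^8+x^4+x^3+x^2+1 (0x11d).
    def gf_mul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a           # "addition" is XOR: no carries
            a <<= 1
            if a & 0x100:        # degree hit 8: reduce
                a ^= 0x11d
            b >>= 1
        return r

    # Sanity check: 2 generates the multiplicative group, so repeated
    # doubling visits all 255 nonzero elements and returns to 1.
    x, seen = 1, set()
    for _ in range(255):
        seen.add(x)
        x = gf_mul(x, 2)
    assert len(seen) == 255 and x == 1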


It takes me back! I wrote my first one in Z80 assembler a looong time ago as part of my first real job (~1990); it was used to error-correct a satellite channel (bandwidth about 19.2 kbit/s, IIRC)!

It used lots of table lookups to make things run at a reasonable speed, but the tables couldn't be too big since there was only an 8K EEPROM. The Z80 code only did d=3 codes (so it could correct 1 error or detect 2).

Working out how to solve a quadratic equation over GF(256) in Z80 assembler was my personal highlight!
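
(For the curious, here's roughly what the table-lookup approach looks like in Python; the field and generator are my assumptions, not the original Z80 code. Log/antilog tables turn multiplies into lookups, and with only 256 elements you can even find quadratic roots by exhaustive search.)

    # Log/antilog tables for GF(2^8) (reduction polynomial 0x11d, generator 2).
    EXP = [0] * 255
    LOG = [0] * 256
    x = 1
    for i in range(255):
        EXP[i] = x
        LOG[x] = i
        x <<= 1                  # multiply by the generator...
        if x & 0x100:
            x ^= 0x11d           # ...and reduce

    def mul(a, b):
        # One multiply = two lookups plus an addition mod 255.
        if a == 0 or b == 0:
            return 0
        return EXP[(LOG[a] + LOG[b]) % 255]

    def quad_roots(b, c):
        # Roots of x^2 + b*x + c over GF(256): with 256 candidates,
        # exhaustive search is cheap (and table-friendly).
        return [x for x in range(256) if mul(x, x) ^ mul(b, x) ^ c == 0]

    print(quad_roots(3, 2))      # x^2 + 3x + 2 = (x+1)(x+2): prints [1, 2]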


Any pointers on how the code is organized?


Fun mathematical aside: if you have a polynomial with non-negative integer coefficients, then you can uniquely determine it with only two points, no matter the degree.



Okay, that took some thinking to figure out why it works. Basically, p(x) tells you the coefficients as long as x is bigger than any coefficient. This is easy to test with the given example polynomial p(x) = 2x³ + 4x + 3, p(10) = 2043, where you can see the coefficients plainly in base 10. But, for example, p(x) = 2x³ + 14x + 3 is ambiguous with p(10) = 2143. (edit: I feel like there is a connection to the Nyquist-Shannon theorem here, the coefficients get kinda "aliased")

The trick here is that because p(1) is the sum of the coefficients, it must also be bigger than any of the coefficients, so p(p(1)) is always "safe" to use.

To avoid base conversions, I think you can always use the next power of ten for the second value, i.e. p(10^n) where n = floor(log10(p(1)))+1. For example if p(1) = 182, then you could ask p(1000) = 12000140030 or 12 000 140 030, which gives us the coefficients (12x³ + 140x + 30). Arguably easier than trying to figure out what 72368326 is in base 182.
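
(Here's the 10^n variant as a Python sketch; recover_coeffs is a made-up name, and p stands for whatever oracle answers the two queries.)

    # Two queries recover a polynomial with non-negative integer coefficients.
    def recover_coeffs(p):
        s = p(1)                   # sum of coefficients bounds each one
        B = 10 ** len(str(s))      # next power of ten above p(1)
        n, coeffs = p(B), []
        while n:
            coeffs.append(n % B)   # read base-B digits, low to high
            n //= B
        return coeffs              # coeffs[i] is the coefficient of x^i

    poly = lambda x: 2 * x**3 + 14 * x + 3   # the "ambiguous" example above
    print(recover_coeffs(poly))              # [3, 14, 0, 2]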


Yeah, I like the 10^n method, since it looks like you get the coefficients written out very nicely.


Ha... the proof is so elegant!


try that with f(x) = x^8 - x^4

EDIT: i cannot read. please ignore :)


One of your coefficients is not non-negative.


you're right


An explanation of Reed-Solomon ECC decoding, using field arithmetic and linear algebra, with runnable code: https://www.nayuki.io/page/reed-solomon-error-correcting-cod...


> Your MP3 maybe got knocked down, but it got up again. You’d need to lose more than 4 pieces to keep it down.

Nice


I felt like the whole article was just a setup for that joke.


it's easily the thing I'm most proud of


A reference to Tubthumping, eh


> high-performance [...] Reed-Solomon

RS has not been the state of the art since, well, at least since BCH happened.


RS codes are a sub-class of BCH codes, where the subfield and field are the same.

RS codes, where they exist, achieve greater distance than BCH codes... but they have a far more restricted set of parameters than BCH codes.

Many applications are moving to things other than BCH codes: even though BCH is a much better class than RS codes, it is still limited, and the existing BCH codes at many sizes are not as good as other kinds of codes.


Just had to say that this is probably the best explanation of Reed-Solomon I've seen: practical explanations, useful visuals, good pacing, just enough humor. :)


Fun stuff. I once wrote a decoder for reading data tracks from a CD in raw mode :)



