Ever since reading "How Music Got Free" by Stephen Witt, I get a bit annoyed when the MPEG team gets credit for "creating" the MP3. They did more to kill it than to support it.
Source code for a lot of early encoders is available to study, and it's probably easier to understand than the modern implementations. LAME started out as a set of performance patches against the ISO sources before eventually being rewritten from scratch.
Toy MP3 encoders are not horrifically complicated, though ones that sound decent are. It's astonishing how much improvement there's been over the past few decades even with the same underlying format; modern 128 kbps MP3s don't even make your ears bleed.
Thanks for the link. I think there's a lot of room for experimental, artistic, or even scientific specialty encoders. There's so much flexibility in picking out what's "important" in a signal.
The way you worded that immediately made me think of glitch art, particularly GIFs, and particularly situations where the glitch aligns perfectly with the content.
It’s a pain in the ass. I built one for a steganography thesis and the psychoacoustic model is really what made it difficult.
The psychoacoustic model alone took more time to code than the PCM splitting, MDCT, windowing, and Huffman coding.
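For a sense of how small the transform side is next to the psychoacoustic model, here is a minimal sketch of the windowed-MDCT step in Python/numpy. The 576-line frame and plain sine window are only loosely MP3-like (a real Layer III encoder runs a 32-band polyphase filterbank and then short MDCTs per band); everything here is illustrative, not the actual format layout:

```python
import numpy as np

def mdct(frame, window):
    """MDCT of one 2N-sample frame -> N spectral lines (direct O(N^2) form)."""
    two_n = len(frame)
    n = two_n // 2
    k = np.arange(n)
    m = np.arange(two_n)
    basis = np.cos(np.pi / n * (m[:, None] + 0.5 + n / 2) * (k[None, :] + 0.5))
    return (frame * window) @ basis

def analyze(pcm, n=576):
    """Split PCM into 50%-overlapping 2N-sample frames, window, and transform."""
    window = np.sin(np.pi / (2 * n) * (np.arange(2 * n) + 0.5))  # sine window
    hop = n                                                      # 50% overlap
    return np.array([mdct(pcm[s:s + 2 * n], window)
                     for s in range(0, len(pcm) - 2 * n + 1, hop)])

# Toy input: one second of a 1 kHz tone at 44.1 kHz.
t = np.arange(44100) / 44100.0
lines = analyze(np.sin(2 * np.pi * 1000.0 * t))
print(lines.shape)  # (number of frames, 576 lines per frame)
```

That whole analysis stage fits in a screenful; deciding which of those lines can be thrown away or coarsely quantized without being heard is where the real work (and the psychoacoustic model) lives.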
It’s a fun project, however painful. There’s an MP3 encoder from the early 90s floating around that you can use as a base if you can fix the legacy code. I can’t remember the name though, sorry.
The decoder doesn't care about the psychoacoustic model at all, though; that's all up to the encoder, and the decoder just gets some frequencies to play back. The psychoacoustic model is also by far the most complicated part of any decent modern encoder.
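To make the asymmetry concrete, here is the decoder's half of a toy transform codec: inverse MDCT plus overlap-add, in Python/numpy. There is no psychoacoustic model anywhere in it; whatever spectral lines the encoder chose to keep, the decoder just turns them back into PCM. Frame size and window are illustrative, not the exact MP3 layout:

```python
import numpy as np

N = 576                                    # spectral lines per frame (illustrative)
WINDOW = np.sin(np.pi / (2 * N) * (np.arange(2 * N) + 0.5))
BASIS = np.cos(np.pi / N * (np.arange(2 * N)[:, None] + 0.5 + N / 2)
               * (np.arange(N)[None, :] + 0.5))

def decode(frames):
    """Inverse MDCT + overlap-add. No psychoacoustic model needed on this side."""
    out = np.zeros((len(frames) + 1) * N)
    for i, lines in enumerate(frames):
        out[i * N:i * N + 2 * N] += WINDOW * (BASIS @ lines) * (2.0 / N)
    return out

# Round trip: transform a tone with the matching forward MDCT, then decode.
pcm = np.sin(2 * np.pi * 1000.0 * np.arange(8 * N) / 44100.0)
frames = [(WINDOW * pcm[s:s + 2 * N]) @ BASIS
          for s in range(0, len(pcm) - 2 * N + 1, N)]
rebuilt = decode(frames)
# Away from the first and last half-frame the aliasing cancels, so the
# reconstruction error is down at floating-point noise.
print(np.max(np.abs(rebuilt[N:len(frames) * N] - pcm[N:len(frames) * N])))
```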
I took a quick look and couldn't find anything obvious, but I'd be surprised if someone wasn't using machine learning to improve psychoacoustic models. A fundamental problem is modelling the subjective listener, I guess.
You could calculate the difference between the original and the decoded copy and then weight the differences by the sensitivity of the human ear at various frequencies...
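A rough sketch of that idea in Python/numpy, using the standard A-weighting curve as the "ear sensitivity" term. A real psychoacoustic model also accounts for masking (loud components hiding nearby quiet ones), which a fixed frequency weighting can't capture, and the frame and window choices here are arbitrary:

```python
import numpy as np

def a_weight(freqs_hz):
    """A-weighting as a linear gain: a crude stand-in for ear sensitivity, no masking."""
    f2 = np.asarray(freqs_hz, dtype=float) ** 2
    ra = (12194.0**2 * f2**2) / ((f2 + 20.6**2)
          * np.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
          * (f2 + 12194.0**2))
    db = 20.0 * np.log10(np.maximum(ra, 1e-30)) + 2.0   # roughly 0 dB at 1 kHz
    return 10.0 ** (db / 20.0)

def weighted_difference(original, decoded, rate=44100, frame=2048):
    """Mean ear-weighted spectral difference between two equal-rate signals."""
    weights = a_weight(np.fft.rfftfreq(frame, 1.0 / rate))
    win = np.hanning(frame)
    total, count = 0.0, 0
    for s in range(0, min(len(original), len(decoded)) - frame + 1, frame):
        a = np.fft.rfft(original[s:s + frame] * win)
        b = np.fft.rfft(decoded[s:s + frame] * win)
        total += np.sum((weights * np.abs(a - b)) ** 2)
        count += 1
    return total / max(count, 1)
```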
It wouldn't be able to measure "warmth" or "heart" or "anger" or anything like that. It would be able to tell you which model most closely matches the original at the frequencies you can hear. Once you have a metric, you can use it as the basis of a parameter space search. Not exactly machine learning, but maybe some preliminaries.
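As a sketch of what that search could look like: sweep an encoder knob, decode each attempt, and score it against the original. The file names, the use of the lame CLI (-V for VBR quality, --decode) and scipy's WAV reader are assumptions for illustration only; the scoring function is a placeholder for a proper ear-weighted metric, and a real setup would also have to compensate for encoder delay so the two signals line up:

```python
import subprocess
import numpy as np
from scipy.io import wavfile          # any WAV reader would do

def score(original, decoded):
    """Placeholder metric: plain mean squared error over the common length."""
    n = min(len(original), len(decoded))
    return float(np.mean((original[:n].astype(float) - decoded[:n].astype(float)) ** 2))

def sweep(original_wav, qualities=range(10)):
    """Try each LAME VBR quality (-V 0..9) and keep the best-scoring one."""
    _, original = wavfile.read(original_wav)
    best = None
    for q in qualities:
        subprocess.run(["lame", "--quiet", "-V", str(q), original_wav, "try.mp3"], check=True)
        subprocess.run(["lame", "--quiet", "--decode", "try.mp3", "try.wav"], check=True)
        _, decoded = wavfile.read("try.wav")
        s = score(original, decoded)
        if best is None or s < best[1]:
            best = (q, s)
    return best   # (quality setting, score)

print(sweep("reference.wav"))   # hypothetical input file
```

Swap the placeholder scorer for an ear-weighted one and widen the grid to more than one parameter and you have the "preliminaries" mentioned above, with no machine learning in sight.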
What's also in vogue is to gather huge datasets, in a fashion only really available to Google and their ilk. I'd imagine a "click-on-all-vehicles" affair, only with sound instead of images.
It's amazing how terrible some of them were back then. I had probably 5 different players on my system because certain files would only play with certain players; encoding and decoding was such a mess.
aepiepaey agrees with you that it's an intentional design decision ("thought ... was a good idea"), and their use of the past tense also implies that they disagree with the decision.