Adventures of developing an MP3 decoder in Python (portalfire.wordpress.com)
56 points by IgorPartola on Oct 3, 2010 | 28 comments



A long, long time ago I coded an MP3 decoder for a TI fixed-point 16-bit DSP @ 100 MHz (the C5xxx family, IIRC).

I initially wrote it in floating-point C (the compiler translated floats to fixed point) and it was damn slow. I then rewrote everything to use fixed-point arithmetic only and got down to about half a second of decoding per second of audio. That still wasn't good enough, because the DSP also had to take care of other things like IDE drives, DMA, FAT, the DAC, USB, display drivers, etc., so I rewrote all the critical parts in assembly, taking advantage of the chip's built-in DSP functionality, and got down to 100 ms of decoding per 1 s of audio.

So it might be helpful to start from a pure Python implementation and then work your way down; you'll still learn a lot.


Since you've done it, can you tell me a bit about the caveats of converting from floating-point to fixed-point arithmetic? It would seem that the only thing you need to do is multiply by whatever scale factor doesn't make you run out of bits, so, as someone who hasn't actually tried it, I'm wondering what the hard part is (apart from the obvious loss of accuracy in moving from floats to ints).
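For the curious, here is a minimal sketch of where it gets tricky, assuming Q15 (the 16-bit, 15-fractional-bit format such DSPs typically use). Everything below is illustrative, not code from the parent's decoder. The hard part is that every multiply produces a double-width intermediate that has to be rounded, shifted back down, and saturated, and each stage of the algorithm needs its own scaling so intermediates neither overflow nor lose all their bits:

```python
def to_q15(x: float) -> int:
    """Convert a float in [-1, 1) to a Q15 integer, saturating at the edges."""
    return max(-32768, min(32767, int(round(x * (1 << 15)))))

def q15_mul(a: int, b: int) -> int:
    """Q15 multiply. The 32-bit intermediate is what's easy to get wrong:
    naively keeping only 16 bits overflows immediately."""
    prod = a * b          # needs a 32-bit register on real hardware
    prod += 1 << 14       # round to nearest instead of truncating
    return max(-32768, min(32767, prod >> 15))  # shift back, saturate

def from_q15(a: int) -> float:
    return a / (1 << 15)

print(from_q15(q15_mul(to_q15(0.5), to_q15(0.25))))  # → 0.125
print(q15_mul(-32768, -32768))  # (-1)*(-1) saturates to 32767, not +1.0
```

The second line shows the classic Q15 edge case: the format can represent -1.0 but not +1.0, so (-1)·(-1) must saturate.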


The bit that cracked me up:

> I finished the decoder and it can produce WAV file output from an MP3. The good news is that it works; the bad news is that it's 34 times too slow to play the audio in real time.

I think it's great to try learning something in a language that makes experimentation easy. Looking forward to seeing how he goes with the optimisation.


My first attempt at writing a decoder was on a "Cyrix" 586 processor at 90 MHz (the kind that fit in a 486 motherboard and let you pretend you had a Pentium). I wrote it in Pascal. It did just a little better than 1.5 weeks per minute of audio. I was thrilled to no end that I'd made it work at all.

My hat's off to him.


Wait, if it took a day per second, how did you even know it worked? :P


I downloaded "Hotel California" via FTP from tamu.edu, set it for somewhere about the middle and did 10 seconds. I recognized what I heard and was thrilled.


Left it running for days on short samples? (Also, 1.5 weeks for 60 seconds would be ~6 seconds of audio per day, not one.)


Huh, this is true. I wonder where my math went, there.


I don't know why you'd try to reach real-time speeds with Python; at the moment it's a really slow language. According to my measurements, JavaScript is currently the fastest "typeless" scripting language: the engine in Firefox can be up to 200 times faster than Python. There's also a command-line version.


But JavaScript does not have anything like NumPy. You can easily get one to two orders of magnitude faster using NumPy when it's applicable, and an MP3 decoder certainly falls into that category.
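To make that concrete, here is a sketch (sizes and names are my own assumptions, not taken from the blog's decoder) of the kind of inner loop an MP3 decoder runs constantly — applying a window to a granule of samples — written both ways:

```python
import numpy as np

N = 576  # samples per granule per channel in MP3
samples = [0.5] * N
window = [float(np.sin(np.pi * (n + 0.5) / N)) for n in range(N)]

# Pure-Python inner loop: one interpreter round-trip per sample
slow = [s * w for s, w in zip(samples, window)]

# The same operation in numpy: a single C-level vectorized multiply
fast = np.asarray(samples) * np.asarray(window)

assert np.allclose(slow, fast)
```

Both produce identical results; the difference is that the NumPy version dispatches the whole multiply to compiled code at once, which is where the one to two orders of magnitude typically come from.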


It's true that NumPy is much faster. I believe there's a lot of native code in NumPy, and once we consider libraries coded in C, any language can be as fast as any other, as long as we can offload enough work to the libraries (for example, Perl also has official C extension mechanisms: http://perldoc.perl.org/h2xs.html, and even old Visual Basic could call DLLs written in C).
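The same offloading idea is visible even without writing an extension module: Python's ctypes can call into a C library directly. A small sketch, assuming a standard libm can be located on the system (as on Linux or macOS):

```python
import ctypes
import ctypes.util

# Load the C math library and declare cos()'s C signature so ctypes
# converts the argument and return value correctly.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

print(libm.cos(0.0))  # → 1.0
```

Every call still pays Python's dispatch overhead, which is why NumPy-style batching (one call operating on a whole array) wins over calling a C function once per sample.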

However, I'd really like it if at least some of what's now part of NumPy (which I believe is by now more a language extension than a library) became part of the Python language itself.


NumPy does have a lot of C code compared to most Python packages, but I don't think that's the right way to look at it. If NumPy were only about something being in C instead of Python, it would not be that interesting.

But NumPy brings you abstractions on top of Python which are useful and even higher-level than Python itself, and that's what is interesting: abstractions which do make your code faster (when applicable, of course).

From the blog's profiling posts, a lot of time seems to be spent in the MDCT and polyphase decoding: both could be coded quite efficiently in NumPy (I hope to add the MDCT itself to SciPy, actually).
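For illustration, the MDCT used for MP3's long blocks (36 samples in, 18 coefficients out) can be written as a single NumPy matrix product. This is a direct O(N²) sketch of the standard MDCT definition, not the FFT-based fast version a real decoder would want:

```python
import numpy as np

def mdct(x):
    """Direct MDCT: X[k] = sum_n x[n] * cos(pi/N * (n + 1/2 + N/2) * (k + 1/2)),
    with 2N input samples and N output coefficients."""
    N = len(x) // 2
    n = np.arange(2 * N)
    k = np.arange(N)
    # Build the full cosine basis by broadcasting, then reduce with one dot product
    basis = np.cos(np.pi / N * np.outer(n + 0.5 + N / 2, k + 0.5))
    return np.asarray(x) @ basis

x = np.random.default_rng(0).standard_normal(36)  # one MP3 long block
X = mdct(x)
assert X.shape == (18,)
```

The pure-Python equivalent is a double loop over n and k; here the whole transform is one `np.outer` and one matrix-vector product, both executed in compiled code.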

About being part of Python: I don't think it would be the right move (and the idea has been given up by both the Python and NumPy sides), because NumPy is quite big and, if you care about speed, depends on some quite hairy dependencies. I don't think it is fair to say that NumPy is a language extension; NumPy feels more natural than, say, Twisted from a Python-integration point of view. But I am a NumPy developer, so maybe not the best judge.


> I don't think it would be the right move because NumPy is quite big, and depends on quite hairy dependencies if you care about speed.

Maybe have a default build with the main functionality, and some way to bring in the hairy dependencies otherwise? As you say:

> NumPy brings you abstractions (...) which are useful and even higher-level than Python,

That's what's a loss for Python as a language. I don't expect access to every Fortran-coded scientific library routine in Python by default, but being able to use the right abstractions in the language itself, and not through something like a patch... maybe that can happen at some point?

I understand, of course, that the current state of things is easier for you and for the Python developers.


Flusspferd (http://flusspferd.org/) has GMP bindings, and there's no reason that node.js (for example) couldn't be hooked up to GMP as well.


Well, wouldn't the fastest scripting language be LuaJIT?


If you're referring to http://shootout.alioth.debian.org/u32/luajit.php, my answer is: maybe. As far as I understand, the JavaScript benchmarks there were made before the most recent speed improvements (JaegerMonkey, present in the Firefox 4 beta). In my simple tests, some loops can reach the speed of C, with times better than LuaJIT's.

Still, LuaJIT is easier to obtain and run, as far as I know.


Downloaded, installed and re-measured today. http://shootout.alioth.debian.org/u32/benchmark.php?test=all...


Thanks! LuaJIT really rules in the shootout.

The tests I've done were only inside Firefox 4 Beta, and I got 3.4 s for the 5M n-body benchmark on my slightly (30%) faster computer. I remember that, traditionally, the standalone builds of Mozilla's JS engine did not give the same results as the one in the browser.

A message for the Mozilla devs, in case anybody reads this: be aware that people do try to measure how fast JS is from the command line, so let them get the very latest results! And use http://shootout.alioth.debian.org if you want to fundamentally improve your implementation of JS.


Note that the main n-body measurement is N=50,000,000. The source code was downloaded from the tracemonkey branch: https://wiki.mozilla.org/JavaScript:New_to_SpiderMonkey.

They can do better than using the benchmarks game; Kraken is one attempt they are making to do better: http://krakenbenchmark.mozilla.com/index.html


Regarding sources, I know this:

http://news.ycombinator.com/item?id=1754881

The benefit of the benchmarks game is that it allows comparison with other languages (like Lua), whereas, as far as I understand, Kraken is JS-only and browser-only. Thanks a lot for your good work!


You can help by providing clear instructions on how to build the latest JavaScript engine.


I'd also like to know. Back when the first JIT worked in the browser, I wasn't able to download sources that actually did JIT on Linux from the command line. I wouldn't be surprised if, once again, the latest speedup in the browser is not reproducible on the command line for most people.


It's probably a stupid question, but: what's the advantage of coding everything in Python? If you're targeting the N900, Psyco likely won't help you, and there's probably some existing MP3 library one could wrap (libmpg123)?

Also: you might want to try Cython; it compiles a Python-like language to (reasonably platform-independent) C or C++ code, with very good Python interoperability.


You'd have known if you had simply scrolled to the introductory post:

> I’m going to write a Python-based MP3 decoder. Yeah, it's a bad idea from a performance standpoint, but from a learning/documenting perspective, it is quick and easy.


OK. But then I don't understand the squeamishness about using NumPy (and possibly SciPy). Not using NumPy in such cases will always give you less-than-decent performance.


Because it will very likely make the code less clear and readable, though it would probably be a necessary step toward reaching real-time performance for his decoder.


It's great for learning and playing with algorithms, particularly since you save so much time that would otherwise be consumed by low-level details, and so you have a much better understanding of the overall program's function when the time comes to optimize. There's a real forest-for-the-trees problem in optimizing something written in C first.


If I had to implement something as sophisticated and novel (to me) as an mp3 decoder, I would first prototype it in my favorite dynamic language and then port it to something fast.



