
I've always wondered how services like Shazam work. I'm amazed that they can do this kind of perceptual hash against ANY 10 second portion of a song. How do they search against something like that when they don't know the start or end time of the segment that is being input?



I do research in music information retrieval. See the ISMIR 2003 paper below. In short, it searches for landmarks in the spectrogram, hashes those landmarks, then compares those hashes against database hashes for temporal continuity. http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf
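The landmark idea in that paper can be sketched in a few lines. This is a toy illustration, not Shazam's actual parameters: pick spectrogram peaks that dominate their local neighbourhood, then hash pairs of nearby peaks as (anchor frequency, target frequency, time delta). All constants below (fan-out, thresholds, window sizes) are made up for the sketch.

```python
import numpy as np

def landmark_hashes(signal, n_fft=512, hop=256, fan_out=3):
    """Toy landmark fingerprinter in the spirit of Wang's paper.

    Returns (hash, anchor_frame) pairs; each hash packs
    (anchor_freq_bin, target_freq_bin, frame_delta).
    """
    # Magnitude spectrogram from overlapping Hann-windowed frames.
    win = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * win
              for i in range(0, len(signal) - n_fft, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1))

    # Landmarks: time-frequency bins that dominate a 3x3 neighbourhood.
    peaks = []
    for t in range(1, spec.shape[0] - 1):
        for f in range(1, spec.shape[1] - 1):
            patch = spec[t - 1:t + 2, f - 1:f + 2]
            if spec[t, f] == patch.max() and spec[t, f] > 2 * patch.mean():
                peaks.append((t, f))

    # Pair each anchor peak with a few later peaks; hash the pair geometry.
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            dt = t2 - t1
            if 0 < dt < 64:
                hashes.append(((f1 << 16) | (f2 << 6) | dt, t1))
    return hashes
```

Because each hash encodes relative geometry between two peaks, the same pair of landmarks produces the same hash no matter where in the song the query starts, which is what makes matching an arbitrary 10-second excerpt possible.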

A seminal paper on audio fingerprinting is the one by Haitsma and Kalker. http://ismir2002.ismir.net/proceedings/02-fp04-2.pdf
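The Haitsma-Kalker scheme works differently: it emits one 32-bit "sub-fingerprint" per frame, where each bit is the sign of an energy difference across adjacent frequency bands and adjacent frames. A rough sketch (band edges here are linear for brevity; the paper uses log-spaced bands roughly between 300 Hz and 2000 Hz, and different frame parameters):

```python
import numpy as np

def sub_fingerprints(signal, n_fft=2048, hop=64, n_bands=33):
    """Sketch of the Haitsma-Kalker fingerprint: one 32-bit value per
    frame, bit m set iff the band-energy difference
    (E(n,m) - E(n,m+1)) - (E(n-1,m) - E(n-1,m+1)) is positive.
    """
    win = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * win
              for i in range(0, len(signal) - n_fft, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1)) ** 2

    # Sum spectral energy into n_bands bands per frame.
    edges = np.linspace(0, spec.shape[1], n_bands + 1).astype(int)
    energy = np.array([[spec[t, edges[b]:edges[b + 1]].sum()
                        for b in range(n_bands)]
                       for t in range(spec.shape[0])])

    d = energy[:, :-1] - energy[:, 1:]   # differences between adjacent bands
    bits = (d[1:] - d[:-1]) > 0          # differences between adjacent frames
    return [int("".join("1" if b else "0" for b in row), 2) for row in bits]
```

The sign-of-difference trick makes the bits robust to overall loudness and mild filtering, and matching reduces to Hamming distance between bit strings.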


There's a GPL implementation of an audio information retrieval approach here: http://code.google.com/p/audioscout/


Thanks!


A Matlab implementation and tutorial to give you an idea can be found here: http://labrosa.ee.columbia.edu/~dpwe/resources/matlab/finger...


Yes, the delta part is clever. Shazam is well described here: http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf
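The "delta" is what makes the lookup work without knowing where the excerpt starts: every matching hash votes for a (song, db_time - query_time) pair, and a true match piles votes into a single offset bin. A minimal sketch, with hypothetical helper names (build_index, best_match) chosen for illustration:

```python
from collections import Counter, defaultdict

def build_index(songs):
    """songs: {song_id: [(hash, time), ...]}.
    Returns an inverted index hash -> [(song_id, time), ...]."""
    index = defaultdict(list)
    for song_id, hashes in songs.items():
        for h, t in hashes:
            index[h].append((song_id, t))
    return index

def best_match(index, query_hashes):
    """Vote on (song, time offset) pairs. Hashes from the true source
    song agree on one offset, so that bin's count is the number of
    temporally consistent hits."""
    votes = Counter()
    for h, t_query in query_hashes:
        for song_id, t_db in index.get(h, []):
            votes[(song_id, t_db - t_query)] += 1
    if not votes:
        return None
    (song_id, offset), score = votes.most_common(1)[0]
    return song_id, offset, score
```

Spurious hash collisions scatter across many offset bins, while the real song's hashes all land in one, so a clear winner in the histogram doubles as a confidence measure.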


For pop/rock/rap music it probably doesn't matter. For classical, I'm not sure.


I think you just suggested that classical music is all the same.


I meant to suggest nearly the opposite: with pop/rock/rap, a 10s chunk is enough to get a signature for the whole piece, while with classical, the music changes enough that any given 10s chunk may not adequately represent a different 10s chunk.


There was a cool HN article a while back on building your own Shazam clone in Java. I think the author received a C&D.

Found it! http://www.redcode.nl/blog/2010/06/creating-shazam-in-java/



