
I once used a modified version of Echoprint to fingerprint a few million tracks from a music service we were working on. Most fun bit was maxing out 80 cores and a LAN segment using a mix of Celery workers to fetch tracks and feed them to a C++ fingerprinter and store the data in Postgres.

The EP fingerprints were a lot smaller, though. IIRC it used a mix of beat and tonal detection.




Echoprint is wonderful for fingerprinting speed (C++), but the fingerprint size is actually smaller in Dejavu (binary(10) field in SQL for each fingerprint).
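To make the size point concrete, here is a rough sketch of how a fingerprint can fit into a binary(10) column: hash a pair of spectrogram peaks plus their time delta, then keep only the first 10 bytes of the digest. The function name, the SHA-1 choice, and the `freq1|freq2|delta_t` key layout are assumptions for illustration, not a statement of exactly what Dejavu does internally.

```python
import hashlib


def peak_pair_hash(freq1: int, freq2: int, delta_t: int, n_bytes: int = 10) -> bytes:
    """Hash one peak pair and truncate the digest to n_bytes.

    Hypothetical helper: the key layout and SHA-1 are illustrative
    assumptions. A shorter digest saves storage but raises the chance
    of collisions (false positives).
    """
    key = f"{freq1}|{freq2}|{delta_t}".encode("utf-8")
    # SHA-1 produces 20 bytes; keep only the leading n_bytes.
    return hashlib.sha1(key).digest()[:n_bytes]


fingerprint = peak_pair_hash(freq1=312, freq2=498, delta_t=17)
print(len(fingerprint))  # number of bytes stored per fingerprint
```

Changing `n_bytes` is the kind of knob that trades storage against false positives.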

The other interesting differences to note are that Echoprint doesn't use a constellation fingerprinting approach along with offsets, and the fingerprinting is meant to be the same across all platforms / use cases so you can compare them.

As a direct result, you also can't get the offset in seconds into the track that your query audio matches, like you can with Dejavu.
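For anyone wondering how stored offsets turn into a position in seconds, a minimal sketch of the usual alignment step: each matching hash votes for (track, stored offset minus query offset), and the most common difference wins. The function name, the tuple layout, and the `seconds_per_frame` parameter are hypothetical, assumed here only to show the idea.

```python
from collections import Counter


def best_alignment(matches, seconds_per_frame):
    """Pick the most consistent (track, offset) from hash matches.

    matches: iterable of (track_id, db_offset, query_offset), with
    offsets in spectrogram frames (assumed layout, for illustration).
    Returns (track_id, offset_in_seconds, vote_count), or None when
    there are no matches.
    """
    # True matches share the same offset difference; random hash
    # collisions scatter across many different differences.
    votes = Counter(
        (track_id, db_off - q_off) for track_id, db_off, q_off in matches
    )
    if not votes:
        return None
    (track_id, diff), count = votes.most_common(1)[0]
    return track_id, diff * seconds_per_frame, count


matches = [("a", 100, 0), ("a", 105, 5), ("a", 110, 10), ("b", 50, 3)]
print(best_alignment(matches, seconds_per_frame=0.5))
```

The vote count doubles as a rough confidence score: a handful of aligned matches is weak evidence, hundreds is not.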

When I coded up this project, I wanted something that was more customizable - allowing you to decide the speed, number of fingerprints, size of the fingerprints, etc. to match your own false positive / memory / CPU requirements.

When you do, you sacrifice interoperability between Dejavu index installations, but you gain application-specific performance. Which library is better of course depends on your use case.


We modified EP to take multiple fingerprints to achieve the same result with offsets (down to 10 seconds, I think), and built a web UI prototype for matching audio from a desktop browser.

It didn't end up becoming a telco service solely due to commercial agreements, but it was a lot of fun and almost embarrassingly accurate with ABBA songs (since we ended up trying a lot of variations on the first entries in the catalogue).



