It's surprisingly easy to do. I know a company that has a bank of servers listening to streaming radio stations all day and night, fingerprinting the songs and taking the track and artist names from the stream to create an in-house Shazam.
I also happen to know the company you’re talking about. However, what they’re doing is way more complicated than a simple Fourier transform; it’s not easy to do at all.
Crawl public Facebook videos, automatically figure out what people were watching/listening to in the background, and correlate that with their public demographic information.
Radio stations use this kind of technology to keep track of how much commercial content is played, which then gives them an indication of the royalties they need to pay.
shazam's value is obviously in how it scaled this method to millions of users and songs but implementing it for yourself on a limited catalog of songs is a couple days of work once you have the theory. in fact this was a lab project for the intro signal processing class at berkeley that i ta'd.
Thanks for the link. I thought these things involved a lot of Markov models and Gaussian functions, but this is mostly just some pretty slick engineering. The 1000x search speed seems very good.
A few others have posted references. We did it as a lab in my undergrad signal processing class. Theory is simple: transform time into frequency domain and you have a fingerprint of the song. I guess the value is in the execution, a la Shazam.
Transform from time into frequency domain and you have the Fourier transform of the song, not a fingerprint. Fingerprinting is a lot more involved than that.
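To make the gap between "take the Fourier transform" and "have a fingerprint" concrete, here is a toy sketch of one commonly described approach: keep only the strongest frequency per short frame, then hash pairs of those peaks together with the time between them. This is not claimed to be what Shazam or anyone else actually ships. The function names, `frame_size`, `fan_out`, and the one-peak-per-frame rule are all assumptions for illustration, and the naive DFT is only usable on tiny inputs.

```python
import cmath
import math


def frame_peaks(samples, frame_size=64):
    """Return (frame_index, strongest_bin) for each non-overlapping frame.

    Assumption for this toy: one peak per frame. Real systems keep several
    peaks per time slice and use overlapping, windowed frames.
    """
    peaks = []
    for idx, start in enumerate(range(0, len(samples) - frame_size + 1, frame_size)):
        frame = samples[start:start + frame_size]
        best_bin, best_mag = 0, -1.0
        # Naive DFT over the positive-frequency bins, skipping DC.
        for k in range(1, frame_size // 2):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            if abs(acc) > best_mag:
                best_bin, best_mag = k, abs(acc)
        peaks.append((idx, best_bin))
    return peaks


def fingerprint(peaks, fan_out=3):
    """Pair each peak with the next `fan_out` peaks.

    Each entry is ((f1, f2, dt), anchor_time): the hash depends only on the
    two frequencies and their time gap, not on where in the song they occur.
    """
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            hashes.append(((f1, f2, t2 - t1), t1))
    return hashes


def tone_sequence(bins, frame_size=64):
    """Hypothetical test signal: one pure tone per frame."""
    out = []
    for b in bins:
        out.extend(math.sin(2 * math.pi * b * n / frame_size)
                   for n in range(frame_size))
    return out
```

Feeding `tone_sequence([5, 9, 12, 7])` through `frame_peaks` and then `fingerprint` gives a short list of hashes that would be identical for a clip starting anywhere in the song, apart from the anchor times. The raw transform alone does not have that property.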
So now you’ve fingerprinted an entire song. How do you go about matching that to the 7 seconds of music and loud background noise I actually present you with?
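One commonly described answer to the matching question is offset voting: look up every hash from the short sample in an index of song hashes, and count how many hits agree on the same time offset into the same song. Hashes produced by background noise still hit the index now and then, but they do not agree on an offset. The sketch below assumes fingerprints are lists of `(hash, time)` pairs; `build_index` and `best_match` are hypothetical names, and a real system would need a score threshold and a far more compact index.

```python
from collections import Counter, defaultdict


def build_index(songs):
    """Map each hash to every (song_id, time) at which it occurs.

    `songs` is assumed to be {song_id: [(hash, time), ...]}.
    """
    index = defaultdict(list)
    for song_id, hashes in songs.items():
        for h, t in hashes:
            index[h].append((song_id, t))
    return index


def best_match(index, query_hashes):
    """Vote on (song_id, offset) pairs; return the winner and its vote count.

    offset = time in the song minus time in the sample, so a true match
    produces many votes for one offset, while chance hits scatter.
    """
    votes = Counter()
    for h, t_query in query_hashes:
        for song_id, t_song in index.get(h, ()):
            votes[(song_id, t_song - t_query)] += 1
    if not votes:
        return None
    (song_id, offset), count = votes.most_common(1)[0]
    return song_id, offset, count
```

The sample never has to line up with the start of the song, and only a fraction of its hashes need to survive the noise for one offset to stand out.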
Execution is everything, and I have found that the on-device index shipping with Pixel devices, which displays the title of the currently playing song on the lock screen, is a much better implementation than having to hurry up and remember where the Shazam app is, open it, and try to identify the song (optionally having to try several times and store the sample for later when I have a better network connection).
the reason i am being downvoted is because hacker news is fucking garbage now. too many insecure, underqualified jackasses looking for a reason to argue.
saying the algorithm "wouldn't work" without an app is like saying my door doesn't work without a doorknob.
OP made a point: Shazam is not magic, and people have the opportunity to compete or DIY. to me, it seems simple/contrarian/not helpful to respond with "they have an app too"
The point is that saying Shazam is just a button ignores the huge amount of effort that sits behind it. Not just the technology but more importantly the sales and marketing to get to the number of users they have.
Amazon or Youtube are not magic either. But good luck getting to that same scale without significant effort.
and saying "they have an app" is ignoring OP's actual point, which is all i am trying to defend. no one is calling into question the merit it's taken to build Shazam as a company.
how do i find myself defending the most ridiculous shit here?... like an algorithm can't work without a UI.
> guy 1: "index the Fourier Transform of all songs"
> guy 2: "[don't forget about] their huge database[!]"
how is this not contrarian? biz guy spotted? the dude straight up fucking said to build a database of "ALL SONGS". if you think that's simple you're either not thinking hard enough or you have no idea what he's talking about.
Google search is more than just indexing the content of all web pages though. It's also getting people to the result they're searching for which is significantly more complex.
That said, it's still a feature, which is why Google makes its money off of advertising instead. That's the real Google product in relation to their search feature.
Google Search is not a feature. It's a product. They have product owners, product managers, product roadmaps etc and it's fundamental to what Google is as a company. They themselves call it a product: https://www.google.com/intl/en/about/products/
And we can play this game all day if you like. By your ridiculous logic: Spotify, Facebook, WhatsApp, Amazon, Youtube are all just features.
Sure, there's a complexity scale where we arbitrarily consider something a "product" instead of a "feature". In fact the scale can slide as technology improves. Something that was previously complicated enough to consider a product may now be trivial enough to become a feature.