Hacker News

Shazam is a feature not a product: index the Fourier Transform of all songs.



It's surprisingly easy to do. I know a company that has a bank of servers listening to streaming radio stations all day and night, fingerprinting the songs and taking the track and artist names from the stream to create an in-house Shazam.


I also happen to know the company you’re talking about. However, what they’re doing is way more complicated than a simple Fourier transform; it’s not easy to do at all.


Must be a different company, then.


Would love to hear more about this and what is complicated and the type of problem they are solving. I have little experience with audio.



This is really great, thanks for sharing!


Just out of curiosity what need do they have for in-house Shazam?


Crawl public Facebook videos, automatically figure out what the uploaders were watching/listening to in the background, and correlate that with their public demographic information.


Radio stations use this kind of technology to keep track of how much commercial content is played, which then gives them an indication of the royalties they need to pay.


My company has shopped around for a 'shazam library' to integrate into our product.

We tried a dozen of these before finding one that was actually working reliably enough.


I bet the stream operators paying the bandwidth bills don't appreciate that too much.

What does this company need an "in-house Shazam" for?


Presumably he's talking about Mediabase, or Meltwater, or something along those lines.


If you can point to any blog post that shows how one would implement Shazam, I'd love to read it.

I read somewhere that the Pixel 2's on-device index of 10,000 popular songs was only ~60MB. That blew my mind.
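A quick back-of-the-envelope on those numbers, taking the recalled ~60MB and 10,000 songs at face value. The 210-second average track length is purely an assumption for illustration, not something from the thread:

```python
def bytes_per_song(index_bytes, song_count):
    """Average index size per song, in bytes."""
    return index_bytes / song_count

# Figures as recalled in the comment above (decimal megabytes assumed).
per_song = bytes_per_song(60 * 10**6, 10_000)  # 6000.0 bytes, about 6 KB

# ASSUMPTION: an average track of 3.5 minutes (210 s), chosen only for scale.
per_second = per_song / 210  # roughly 28.6 bytes of index per second of audio

print(f"{per_song:.0f} bytes per song, {per_second:.1f} bytes per second")
```

So if the figures are right, each song has to be summarized in about 6 KB, which only leaves room for a sparse set of hashes rather than anything resembling the audio itself.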


the original paper is quite readable: https://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf

shazam's value is obviously in how it scaled this method to millions of users and songs but implementing it for yourself on a limited catalog of songs is a couple days of work once you have the theory. in fact this was a lab project for the intro signal processing class at berkeley that i ta'd.


Thanks for the link. I thought these things involved a lot of Markov models and Gaussian functions, but this is mostly just some pretty slick engineering. The 1000x search speed seems very good.


I can imagine that at Shazam's scale it's just a completely different ball game.


Check out this article that was on HN before:

http://royvanrijn.com/blog/2010/06/creating-shazam-in-java/


Search for “python fingerprinting dejavu”. The article linked in the GitHub repo is a must read.


A few others have posted references. We did it as a lab in my undergrad signal processing class. Theory is simple: transform time into frequency domain and you have a fingerprint of the song. I guess the value is in the execution, a la Shazam.


Transform from time into frequency domain and you have the Fourier transform of the song, not a fingerprint. Fingerprinting is a lot more involved than that.


Well, it's still not difficult. You just need a good bucket size for the FFT, some filtering, and a hash function. Voila.
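For anyone following along, here is a rough sketch of what "bucket size, filtering, hash" could look like. It is a toy under loud assumptions, not Shazam's implementation: a naive pure-Python DFT stands in for a real FFT, "filtering" is reduced to keeping the single strongest bin per frame, and all names (`dft_magnitudes`, `peak_bins`, `fingerprint`, `melody`) and parameters (`frame_size`, `fan_out`) are made up for illustration.

```python
import cmath
import hashlib
import math

def dft_magnitudes(frame):
    """Naive O(n^2) DFT; returns magnitudes of the first n/2 bins."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]

def peak_bins(samples, frame_size=64):
    """Keep only the strongest non-DC bin of each non-overlapping frame."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        mags = dft_magnitudes(samples[start:start + frame_size])
        peaks.append(max(range(1, len(mags)), key=mags.__getitem__))
    return peaks

def fingerprint(samples, frame_size=64, fan_out=3):
    """Hash pairs of nearby peaks (f1, f2, frame gap); keep the frame index of f1."""
    peaks = peak_bins(samples, frame_size)
    prints = []
    for i, f1 in enumerate(peaks):
        for dt in range(1, fan_out + 1):
            if i + dt < len(peaks):
                key = f"{f1}|{peaks[i + dt]}|{dt}".encode()
                prints.append((hashlib.sha1(key).hexdigest()[:10], i))
    return prints

def melody(bins, frame_size=64):
    """Synthetic test signal: one pure tone per frame, centered on the given bin."""
    return [math.sin(2 * math.pi * b * t / frame_size)
            for b in bins for t in range(frame_size)]

signal = melody([5, 9, 5, 12])
print(peak_bins(signal))         # [5, 9, 5, 12]
print(len(fingerprint(signal)))  # 6 (hash, frame) pairs
```

Hashing pairs of peaks plus their time gap, rather than single peaks, is the idea from the Wang paper linked above; everything else here is simplified well past the point of being useful on real audio.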


So now you’ve fingerprinted an entire song. How do you go about matching that to the 7 seconds of music and loud background noise I actually present you with?


No idea. I'm sure that's where it gets challenging.

Off the top of my head: lots more noise analysis and reduction (ML?) + partial spectrum matching (some kind of common tones/beat fingerprinting?).

But I still stand by my response to the original comment.
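The Wang paper linked earlier in the thread answers the matching question without heavy noise reduction: look up each hash from the clip in the index, then vote on the time offset between clip and song. Noise hashes simply fail to match or scatter their votes, while a true match piles votes onto one offset. A minimal sketch, assuming fingerprints are already available as (hash, time) pairs; the function names and the toy hash strings are hypothetical:

```python
from collections import Counter, defaultdict

def build_index(catalog):
    """catalog: {song_id: [(hash, time), ...]} -> {hash: [(song_id, time), ...]}"""
    index = defaultdict(list)
    for song_id, prints in catalog.items():
        for h, t in prints:
            index[h].append((song_id, t))
    return index

def best_match(index, clip_prints):
    """Vote on (song, offset) pairs and return the winner with its vote count."""
    votes = Counter()
    for h, t_clip in clip_prints:
        for song_id, t_song in index.get(h, ()):
            votes[(song_id, t_song - t_clip)] += 1
    if not votes:
        return None
    (song_id, offset), score = votes.most_common(1)[0]
    return song_id, offset, score

catalog = {
    "song_a": [("h1", 0), ("h2", 1), ("h3", 2), ("h4", 3), ("h5", 4)],
    "song_b": [("h9", 0), ("h2", 1), ("h8", 2), ("h7", 3)],
}
index = build_index(catalog)

# Clip starts 2 frames into song_a; "zz" is a noise hash, "h2" a chance collision.
clip = [("h3", 0), ("h2", 0), ("h4", 1), ("zz", 1), ("h5", 2)]
print(best_match(index, clip))  # ('song_a', 2, 3)
```

The clip never needs to cover the whole song, and a real system would presumably also require the winning score to clear some threshold before reporting a match.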



There's an app as well as their huge database that the algorithm wouldn't work without.


it’s just a button. and in a perfect world I wouldn’t even need a button


Not sure why you are downvoted, it is true.

Execution is everything, and I have found that the on-device index shipping with Pixel devices, which displays the title of the currently playing song on the lock screen, is a much better implementation than having to hurry up and remember where the Shazam app is, open it, and try to identify the song (optionally having to try several times and store the sample for later when I have more network).


> (optionally having to try several times and store the sample for later when I have more network).

Weird that you brought that up, since in the Pixel implementation you'd just lose the song forever.


Well, I am trying to stay impartial :) .

While I think it is the best execution so far, it is not perfect.


the reason i am being downvoted is because hacker news is fucking garbage now. too many insecure, underqualified jackasses looking for a reason to argue.


Sure. Search, Buy, Publish, Analyze, Like are also just buttons.

But there are tens of thousands of the best developers and billions in investment to make those buttons work.


what is your point?

saying the algorithm "wouldn't work" without an app is like saying my door doesn't work without a doorknob.

OP made a point: Shazam is not magic, and people have the opportunity to compete or DIY. to me, it seems simple/contrarian/not helpful to respond with "they have an app too"


The point is that saying Shazam is just a button ignores the huge amount of effort that sits behind it. Not just the technology but more importantly the sales and marketing to get to the number of users they have.

Amazon or Youtube are not magic either. But good luck getting to that same scale without significant effort.


> saying Shazam is just a button

Nobody said that.

colordrops was claiming that the app itself was important, on top of the backend effort.

And it's not. The app is trivial.


and saying "they have an app" is ignoring OP's actual point, which is all i am trying to defend. no one is calling into question the merit it's taken to build Shazam as a company.

how do i find myself defending the most ridiculous shit here?... like an algorithm can't work without a UI.

> guy 1: "index the Fourier Transform of all songs"

> guy 2: "[don't forget about] their huge database[!]"

how is this not contrarian? biz guy spotted? the dude straight up fucking said to build a database of "ALL SONGS". if you think that's simple you're either not thinking hard enough or you have no idea what he's talking about.


That's ok. Apple is probably just buying it for their team of devs, right? The app/technology is just a bonus.


Has Shazam done any product innovation for a while now? They may not have many developers left, just running what they have?


LinkedIn shows at least a few software engineers currently employed.


Google Search is a feature not a product: index the content of all web pages.

So not sure your statement makes much sense. When you have that much customer data then it's absolutely a product.


Google search is more than just indexing the content of all web pages though. It's also getting people to the result they're searching for which is significantly more complex.

That said, it's still a feature, which is why Google makes its money off of advertising instead. That's the real Google product in relation to their search feature.


Google Search is not a feature. It's a product. They have product owners, product managers, product roadmaps etc and it's fundamental to what Google is as a company. They themselves call it a product: https://www.google.com/intl/en/about/products/

And we can play this game all day if you like. By your ridiculous logic: Spotify, Facebook, WhatsApp, Amazon, Youtube are all just features.


Sure, there's a complexity scale where we arbitrarily consider something a "product" instead of a "feature". In fact the scale can slide as technology improves. Something that was previously complicated enough to consider a product may now be trivial enough to become a feature.


> Something that was previously complicated enough to consider a product may now be trivial enough to become a feature.

MP3 players, for example.



