It's surprisingly easy to do. I know a company that has a bank of servers listening to streaming radio stations all day and night, fingerprinting the songs and taking the track and artist names from the stream to create an in-house Shazam.
I also happen to know the company you’re talking about. However, what they’re doing is far more complicated than a simple Fourier transform; it’s not easy to do at all.
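To give a feel for why a bare Fourier transform doesn't get you there, here is a toy sketch of peak-pair ("landmark") hashing, one commonly described way to fingerprint audio. I'm not claiming this is what that company runs. Everything here is an assumption for illustration: the names (`peaks`, `fingerprint`, `best_offset`, `tone`), the frame sizes, and the naive DFT. A real system also has to cope with noise, compression, tempo shifts, and millions of tracks.

```python
import cmath
import math
from collections import Counter

FRAME = 64    # samples per analysis frame (illustrative value)
HOP = 32      # step between frame starts (illustrative value)
FAN_OUT = 3   # how many later frames each peak is paired with (illustrative)


def strongest_bin(frame):
    """Index of the strongest non-DC frequency bin, using a naive DFT."""
    n = len(frame)
    best_bin, best_mag = 1, -1.0
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin


def peaks(samples):
    """One dominant bin per frame: a very crude 'constellation'."""
    return [strongest_bin(samples[s:s + FRAME])
            for s in range(0, len(samples) - FRAME + 1, HOP)]


def fingerprint(samples):
    """List of ((bin_a, bin_b, frame_gap), anchor_frame) hashes."""
    p = peaks(samples)
    hashes = []
    for t, f1 in enumerate(p):
        for dt in range(1, FAN_OUT + 1):
            if t + dt < len(p):
                hashes.append(((f1, p[t + dt], dt), t))
    return hashes


def best_offset(reference, query):
    """Vote on the frame offset where query hashes line up with the reference.

    Returns (offset_in_frames, votes), or (None, 0) if nothing matches.
    """
    index = {}
    for h, t in fingerprint(reference):
        index.setdefault(h, []).append(t)
    votes = Counter()
    for h, tq in fingerprint(query):
        for tr in index.get(h, []):
            votes[tr - tq] += 1
    return votes.most_common(1)[0] if votes else (None, 0)


def tone(bin_index, frames):
    """Synthetic test signal: a sine that sits exactly on one DFT bin."""
    n = frames * HOP
    return [math.sin(2 * math.pi * bin_index * i / FRAME) for i in range(n)]
```

Try it by gluing a few `tone(...)` segments into a "song", slicing a chunk out of the middle as the query, and calling `best_offset(song, chunk)`. The point is that the matching comes from hashing pairs of peaks and voting on consistent time offsets. The transform itself is only the first step, and this toy skips all the parts that are hard in practice.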
Crawl public Facebook videos, automatically figure out what the people in them were watching or listening to in the background, and correlate that with their public demographic information.
Radio stations use this kind of technology to keep track of how much commercial content is played, which then gives them an indication of the royalties they need to pay.