It terrifies me that even disclosing that you are attempting to understand a patent implies you intend to infringe upon it. It's as close to a thoughtcrime as possible without peeking into our heads. Worst of all, this is only happening thanks to our modern age of information sharing[1], which helps encourage innovation.
Even if the topic covered were complex enough to warrant a patent, this is still disturbing, because the accusation of patent infringement aims to directly halt the understanding of the innovations at play. Patents are an exclusive right granted in exchange for detailed public disclosure of the invention. Promoting understanding and innovation is their purpose!
This wasn't angled towards mass production, just towards understanding and spreading knowledge, which is fundamentally important in our knowledge-based field.
[1]: Quote: "Also, as I'm sure you are aware, your blogpost may be viewed internationally. As a result, you may contribute to someone infringing our patents in any part of the world."
This is the double bind. You can't go looking because you're penalized.
Yet the PTO and the patent community refer to patents as "teachings". I continually heard that. "This patent 456 teaches that yadda yadda bing-bap-boom" and your very soul (well, professional career) is pillar-of-salt material if you even go look.
And yet, none of the really hard stuff seems to be patented. Or I could just be bitter. Or both :-)
"Let's see if someone's got prior art on (super whiz-bang technology)." Click, page, click . . .
"Wait, no--"
"Oh crap, I just read someone's hash table patent. Pass me another mind-wipe pill, would you?"
"Sigh. I'll have them hire another lawyer downstairs."
Which is what it's really about these days. Ka-ching, baby.
Even those who promote patenting algorithms should agree that a patent's purpose is to open up the details of a machine or process in exchange for a monopoly.
That was the purpose of the original patent system: that everybody could replicate your machine or process with the information provided.
When you want a monopoly and also want to ban publishing the details of the algorithm or working examples, something has gone wrong.
It has gone too far, with people recommending that others NOT read patents, because it will be bad for them in court.
The reason that companies tell employees not to read patents is that knowingly infringing a patent carries triple damages (as opposed to just accidentally reimplementing something).
I've read a bunch of software patents, with attorneys helping me out with language and advice, but this was under the guise of "How can we do a product like X, but not infringe on the related patents?" The patents in question were very broad, nearly without exception either ridiculously obvious or adoption of techniques that I found in textbooks and papers from conferences. It was almost like the company producing product X had bribed the PTO, it was that bad.
I believe that nearly every piece of software exceeding a few hundred lines of code infringes on someone's bullshit software or business methods patent. And of course, you can't go reading all the patents, so every software project is pre-screwed. Shipping software these days is a matter of ignoring the issue, crossing your fingers and hoping that a patent troll doesn't come knocking.
The PTO largely doesn't seem to care or do any research beyond the bare basics. As far as I can tell they're not really incentivised to do so either.
They collect a whole heap of fees for getting and maintaining a patent (http://www.uspto.gov/learning-and-resources/fees-and-payment...), and even claim a fee for revisiting a patent, but so far as I've been able to find, it doesn't cost them anything to actually invalidate one. It ought to cost them all the money they've taken so far, and then some on top, to provide them with more of an incentive to make sure the patent is correct.
Trolling hobbyists is indefensible, but it's simplistic to say that this is an obvious patent on "matching music using a hashing function." The patent in question is: 6,990,453. It describes doing a frequency domain analysis of the audio signal to obtain a set of landmarks, and using characterizations of those landmarks to obtain the fingerprints, and using the fingerprints to match songs: http://www.google.com/patents/US6990453. That's why Shazam can recognize recorded and transcoded music. Simple hashing would not do that.
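To make the "landmarks, then fingerprints, then matching" description above a bit more concrete, here is a toy sketch of the general idea. It is not the method claimed in the patent and not Shazam's code. Every name in it (tone_sequence, frame_peaks, fingerprints, best_offset) and every parameter (a 64-sample frame, a fan-out of 3) is made up for illustration, and it assumes clean synthetic tones rather than real recordings.

```python
import cmath
import math
from collections import Counter

FRAME = 64  # samples per analysis frame (arbitrary choice for this sketch)

def tone_sequence(bins, frame_size=FRAME):
    """Synthetic test signal: one pure sine per frame, at the given DFT bin."""
    return [math.sin(2 * math.pi * k * n / frame_size)
            for k in bins for n in range(frame_size)]

def frame_peaks(samples, frame_size=FRAME):
    """Landmark step: the strongest frequency bin of each non-overlapping frame."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        mags = []
        for k in range(1, frame_size // 2):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            mags.append((abs(acc), k))
        peaks.append(max(mags)[1])
    return peaks

def fingerprints(peaks, fan_out=3):
    """Fingerprint step: pair each peak with a few later peaks.

    Each hash is (bin_a, bin_b, frame_gap); it is stored with the anchor frame.
    """
    out = []
    for i, a in enumerate(peaks):
        for j in range(i + 1, min(i + 1 + fan_out, len(peaks))):
            out.append(((a, peaks[j], j - i), i))
    return out

def best_offset(reference, query):
    """Matching step: vote on the frame offset where query hashes line up."""
    index = {}
    for h, t in fingerprints(frame_peaks(reference)):
        index.setdefault(h, []).append(t)
    votes = Counter()
    for h, t in fingerprints(frame_peaks(query)):
        for ref_t in index.get(h, []):
            votes[ref_t - t] += 1
    return votes.most_common(1)[0] if votes else None

reference = tone_sequence([5, 9, 3, 12, 7, 14, 4, 10])
query = tone_sequence([3, 12, 7, 14])   # an excerpt starting at frame 2
print(best_offset(reference, query))
```

The point of hashing pairs of peaks plus their time gap, instead of hashing raw samples, is that the hash depends only on which frequencies are prominent and how far apart they occur, so it has a chance of surviving re-encoding. A real system would need overlapping windows, several peaks per frame, and tolerance for noise, none of which is attempted here.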
While technically awesome, it's still math. It cuts so closely to being able to patent proofs or mathematical expressions (it's kind of hard to argue that's not exactly what it is) and I hope one day the judicial system understands software well enough to invalidate software patents.
It's technically amazing. Shazam still amazes me with its speed and accuracy. But in the end, it's simple (advanced) numerical analysis.
I'd much rather their implementation be held close to the vest as a trade secret, where stealing source code is illegal, but if I produce similar functionality on my own via my own work and time investment, I don't need to fear that someone has government sponsored exclusivity on that pattern of mathematical analysis.
This is a very prevalent misconception around here, so I will address it again. Saying "software is mathematics" is about as correct as saying "machines are physics". (Note that you cannot patent laws of physics either.)
I think it's natural to expect that an audio-specific hashing function would operate at least in part in the frequency domain, and that it would have to identify key features to ensure that similar sounds receive similar hashes. That much is obvious. Anyone who's ever heard of the FFT could get that far independently in five minutes, and should be free to do so without fear of patent lawsuits.
I don't know much about this patent other than what I read in the past few minutes, but:
1. The claims don't cover "use FFT, get key features". They actually claim something different, though quite broadly.
2. Having some background in signal processing, I can tell you it's never that simple. Many years ago, I once took on a similar project that looked like "oh, a simple frequency-domain cross-correlation should suffice" and it turned into a multi-month exercise that ended without a satisfactory solution. I'm guessing the right solution would have looked a lot like this patent.
Taking this use case, for instance: What's a "key feature" that works best for your use case? You do an FFT, fine. What next? What do you hash? How do you hash it in a way to get something useful? You have a ton of information: phase, frequency, amplitude for an N-point FFT, and you have M such chunks of FFTs. How do you use these M x N points to solve your problem? Can you come up with something (without reading the patent, of course) that works reasonably well in a reasonably short time frame?
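For a sense of scale, here is a small sketch of the M x N grid those questions are about, plus one naive way to shrink it. It only illustrates the question and does not answer it. The function names (dft, spectrogram, strongest_bins) and the choice to keep the two loudest bins per chunk are assumptions made up for this example, and the input is a clean synthetic two-tone signal.

```python
import cmath
import math

def dft(frame):
    """Plain N-point DFT: one complex value (amplitude and phase) per frequency bin."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
            for k in range(n)]

def spectrogram(samples, n):
    """Split the signal into M non-overlapping chunks and take an N-point DFT of each."""
    return [dft(samples[s:s + n]) for s in range(0, len(samples) - n + 1, n)]

def strongest_bins(spectrum, keep=2):
    """One naive 'key feature': indices of the loudest bins below Nyquist.

    This throws away exact amplitudes and all phase information.
    """
    half = range(1, len(spectrum) // 2)
    loudest = sorted(half, key=lambda k: abs(spectrum[k]), reverse=True)[:keep]
    return sorted(loudest)

N = 32
# Two tones, at bins 3 and 7, lasting four chunks.
signal = [math.sin(2 * math.pi * 3 * i / N) + 0.5 * math.sin(2 * math.pi * 7 * i / N)
          for i in range(4 * N)]

grid = spectrogram(signal, N)
print(len(grid), "chunks x", len(grid[0]), "bins")
print([strongest_bins(chunk) for chunk in grid])
```

Even this tiny input gives 4 x 32 complex numbers, and the reduction above keeps only two small integers per chunk. Whether that is the right thing to keep, and how to turn it into a hash that matches noisy recordings, is exactly the hard part the comment is pointing at.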
I've thought about the problem before, haven't read the patent, and I don't think I've ever used Shazam. By combining my prior thoughts with the state of the art of open research, I do think a small team could come up with something quite effective. I might start with the Annotator and Chordata examples from CLAM: http://clam-project.org/wiki/Frequenly_Asked_Questions#Which...
I'm always surprised by the confusion of well meaning, intelligent, creative but obviously naive hobbyists trying to understand the justice or logic when they are threatened by lawyers backed by organisations with lots of money.
Here's a wake-up call: Justice has nothing to do with any of this, and the only logic behind it is the golden rule, i.e. those who have the gold make the rules. This is nothing new. Satisfying the greed of the powerful by invoking terror on the less powerful is a very common occurrence throughout human history. The only things that change in different times and places are the implementations of this rule.
Sorry to be so cynical about this, and I'd be delighted if you prove me wrong. It's ironic how patents today exist to mostly serve the exact opposite of their original intent.
Interesting letter, showing that lawyers think matching music with a hashing function is not obvious (else why would they try to bully someone into removing that information from the web?).
Lawyers actually “think” nothing. Or rather, what they write and what they argue is what they are paid to write and argue. It has no actual connection to what they actually think - they might not even have actually personally considered the issue, and may never do so. However, what they write is in many ways meant to be taken seriously and reacted to, just like troll posts. Therefore, take care not to be trolled by lawyers.
A quick search on github.com (https://github.com/search?utf8=%E2%9C%93&q=shazam) lists a lot of implementations related to it. It's bad that they are going after a single guy. Maybe their implementation is exactly the same as what was posted on the blog.