Trolling hobbyists is indefensible, but it's simplistic to say that this is an o...

themartorana · on May 23, 2015

While technically awesome, it's still math. It cuts so closely to being able to patent proofs or mathematical expressions (it's kind of hard to argue that's not exactly what it is) and I hope one day the judicial system understands software well enough to invalidate software patents.

It's technically amazing. Shazam still amazes me with its speed and accuracy. But in the end, it's simply (advanced) numerical analysis.

I'd much rather their implementation be held close to the vest as a trade secret, where stealing source code is illegal, but if I produce similar functionality on my own via my own work and time investment, I don't need to fear that someone has government sponsored exclusivity on that pattern of mathematical analysis.

throwawaykf05 · on May 24, 2015

> ... it's still math...

This is a very prevalent misconception around here, so I will address it again. Saying "software is mathematics" is about as correct as saying "machines are physics". (Note that you cannot patent laws of physics either.)

nitrogen · on May 24, 2015

I think it's natural to expect that an audio-specific hashing function would operate at least in part in the frequency domain, and that it would have to identify key features to ensure that similar sounds receive similar hashes. That much is obvious. Anyone who's ever heard of the FFT could get that far independently in five minutes, and should be free to do so without fear of patent lawsuits.
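As a toy illustration of that five-minute starting point (a sketch only, not the patented method and not anything Shazam is known to use): take an FFT, keep the few strongest frequency bins, and use them as the hash. The function name `coarse_spectral_hash`, the `n_peaks` parameter, and the test tones are made up for this example.

```python
import numpy as np

def coarse_spectral_hash(signal, n_peaks=3):
    """Hypothetical helper: hash a clip by its strongest FFT bins.

    Amplitude and phase are discarded, so a quieter or slightly
    noisy copy of the same sound maps to the same tuple.
    """
    magnitudes = np.abs(np.fft.rfft(signal))
    strongest = np.argsort(magnitudes)[-n_peaks:]  # indices of the largest bins
    return tuple(sorted(int(b) for b in strongest))

# One second at 8 kHz, so each FFT bin is exactly 1 Hz wide.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
clean = (np.sin(2 * np.pi * 440 * t)
         + 0.6 * np.sin(2 * np.pi * 880 * t)
         + 0.3 * np.sin(2 * np.pi * 1320 * t))

# A quieter copy with a little additive noise (seeded for repeatability).
rng = np.random.default_rng(0)
noisy_quiet = 0.5 * clean + 0.01 * rng.standard_normal(len(t))

hash_clean = coarse_spectral_hash(clean)
hash_noisy = coarse_spectral_hash(noisy_quiet)
```

Both clips hash to the same three bins here, but only because the tones are stationary and line up exactly with FFT bins. Real recordings are neither, which is where the hard part begins.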

throwawaykf05 · on May 24, 2015

I don't know much about this patent other than what I read in the past few minutes, but:

1. The claims don't cover "use FFT, get key features". They actually claim something different, though quite broadly.

2. Having some background in signal processing, I can tell you it's never that simple. Many years ago, I took on a similar project that looked like "oh, a simple frequency-domain cross-correlation should suffice" and it turned into a multi-month exercise that ended without a satisfactory solution. I'm guessing the right solution would have looked a lot like this patent.

Taking this use case, for instance: What's a "key feature" that works best for your use case? You do an FFT, fine. What next? What do you hash? How do you hash it in a way to get something useful? You have a ton of information: phase, frequency, amplitude for an N-point FFT, and you have M such chunks of FFTs. How do you use these M x N points to solve your problem? Can you come up with something (without reading the patent, of course) that works reasonably well in a reasonably short time frame?
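To make the M x N question concrete, here is a minimal sketch of one arbitrary set of answers: split the audio into M frames, take an FFT of each, throw away phase and amplitude, keep only the loudest bin per frame, and hash consecutive pairs of those bins. This is a naive baseline chosen for illustration, not what the patent claims; the names `stft_frames` and `peak_pair_hashes` and the frame length are assumptions.

```python
import numpy as np

def stft_frames(signal, frame_len=256):
    """Split into non-overlapping frames and FFT each one.

    Returns a complex array of shape (M, frame_len // 2 + 1):
    M chunks, each carrying amplitude and phase per frequency bin.
    """
    n_frames = len(signal) // frame_len
    frames = np.reshape(signal[:n_frames * frame_len], (n_frames, frame_len))
    return np.fft.rfft(frames * np.hanning(frame_len), axis=1)

def peak_pair_hashes(spectrum):
    """One possible choice of what to hash: (peak bin, next frame's peak bin)."""
    peaks = np.abs(spectrum).argmax(axis=1)  # loudest bin in each frame
    return [(int(a), int(b)) for a, b in zip(peaks[:-1], peaks[1:])]

# Test signal: 1000 Hz for four frames, then 2000 Hz for four frames.
# At 8 kHz with 256-sample frames each bin is 31.25 Hz wide,
# so the tones fall on bins 32 and 64.
sample_rate = 8000
n = np.arange(1024)
tone_a = np.sin(2 * np.pi * 1000 * n / sample_rate)
tone_b = np.sin(2 * np.pi * 2000 * n / sample_rate)
signal = np.concatenate([tone_a, tone_b])

spectrum = stft_frames(signal)
hashes = peak_pair_hashes(spectrum)
```

Every decision in that sketch is open to question: one peak per frame or several, adjacent frames or wider spans, whether time offsets belong in the hash, and how any of it survives noise, pitch shifts, or a clip that starts mid-frame.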

nitrogen · on May 24, 2015

I've thought about the problem before, haven't read the patent, and I don't think I've ever used Shazam. By combining my prior thoughts with the state of the art of open research, I do think a small team could come up with something quite effective. I might start with the Annotator and Chordata examples from CLAM: http://clam-project.org/wiki/Frequenly_Asked_Questions#Which...