I believe the voice recognition accuracy of Siri is far superior to that of Google Voice Actions, and that it requires the dual-core CPU.

Google has taken a different approach: your voice sample is uploaded to a Google server, processed there, and the result is sent back to the device. This uses less on-device CPU power but is also far less accurate, as the voice sample must be very low quality to get a quick response from Google's server.
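
Roughly, that round trip looks like the sketch below (Python, with a hypothetical endpoint and field names; Google's actual protocol isn't documented here):

    # Sketch of a cloud speech-recognition round trip.
    # The URL, content type, and response field are illustrative,
    # not Google's real API.
    import requests

    def transcribe_in_cloud(audio_path):
        # Speech is typically captured at 8-16 kHz and compressed with a
        # low-bitrate codec before upload to keep the request small.
        with open(audio_path, "rb") as f:
            audio = f.read()
        resp = requests.post(
            "https://speech.example.com/v1/recognize",  # hypothetical endpoint
            headers={"Content-Type": "audio/amr"},
            data=audio,
            timeout=5,
        )
        return resp.json()["transcript"]

The trade-off is that the handset only records and compresses; all of the heavy modelling happens server-side.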

Apple/Siri are taking the approach that high-quality voice recognition must be done on the device in order to deliver the necessary performance and accuracy. I think we will find that Siri actually works and has fewer errors than Google's voicemail transcription.

This is the reason for requiring the iPhone 4S.



Why would it be less accurate? I've worked with voice quite a bit at my last job, and the codecs and bitrates for voice don't need to be heavy-duty CD-quality stuff. Your big limitation is going to be the cheesy mics, background noise, wind, etc.

Voice compresses nicely. It turns out we humans aren't capable of producing sound so varied that it can't be compressed well. Our mouth holes are ancient technology.
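
To put rough numbers on that (back-of-envelope only; the bitrates are typical figures for these codecs, not measurements of either service):

    # Approximate upload size for 5 seconds of audio at different bitrates.
    seconds = 5
    bitrates_kbps = {
        "CD-quality PCM (44.1 kHz, 16-bit stereo)": 1411,
        "wideband speech codec (e.g. AMR-WB)": 24,
        "narrowband speech codec (e.g. AMR-NB)": 8,
    }
    for name, kbps in bitrates_kbps.items():
        print(f"{name}: ~{kbps * seconds / 8:.0f} KB")
    # prints roughly 882 KB, 15 KB, and 5 KB respectively

A few kilobytes per utterance is nothing over a 3G connection, which is the point: the microphone and the background noise are the bottleneck, not the bitrate.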

In practice, I'm in love with Google's voice capabilities. It seems to understand context. It's crazy how accurate it is. I often tease my iPhone friends with it. I'm also highly skeptical that an application on a phone can outdo Google's massive libraries and server infrastructure. If anything, I'd expect Apple's voice recognition to be worse. Regardless, I can't wait to see this stuff in action. A war for the best voice recognition would be great right now, as it's been a patent-blocked and mostly ignored field.


You've got it backwards. Modern speech recognizers have a vocabulary of a million words and multi-gigabyte models. It's generally much more accurate to do speech recognition in the cloud, since you have more processing power and more RAM to hold large statistical models.
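
Back-of-envelope version of why that matters (the n-gram count and bytes per entry below are illustrative assumptions, not figures for any particular recognizer):

    # Rough size of a web-scale language model vs. a 2011 handset.
    ngrams = 1_000_000_000   # assumed entry count for a large pruned n-gram model
    bytes_per_ngram = 8      # assumed, even with quantization and compact storage
    print(f"language model alone: ~{ngrams * bytes_per_ngram / 1e9:.0f} GB")  # ~8 GB
    print("iPhone 4S RAM: 0.5 GB, shared with the OS and every other app")

Whatever runs entirely on the handset has to be a much smaller, heavily pruned model.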

The rumor is that Apple is sending the audio to Nuance servers, i.e., they're doing cloud-based speech recognition.


False: they're doing on-device speech recognition. Servers are only used if you request information from the Internet. Look at the demos and read the hands-on reports. On-device recognition makes it much more usable than Google Voice Actions.


+1 on the cloud advantage.

I tried Google Voice Actions on my Nexus One quite a while ago, but it was optimised for the US market. The accuracy for me was so bad that I didn't bother with it.

Then, recently, a Google blog post in my RSS feed said the latest app had been optimised for my locale. Now, of course, it's spookily accurate.

You can't beat algorithms in the cloud that are being bombarded with sample data round the clock.


    This takes less CPU power but is also far less accurate,
    as the voice sample must be very low quality to have a
    quick response time from Google's server.
Has there been a comparison of the accuracy of the two services? This claim seems unsupported.



