Speech recognition is one of those pie-in-the-sky features that we always get excited about, but that somehow never works all that well. It makes for a great demo, but sooner or later most of us stop using it, be it because of poor recognition or because it does not really save any time. After all, speech recognition has been a feature in most higher-end phones for quite some time now, yet you do not see too many people using it. Except maybe in a car.
Now advances in AI are about to revolutionize the way we talk to our gadgets. Just think how good Google Voice Search has become (with multiple language support), and the way we can tell our iPhone to play a specific album. There is also talk that Apple is building speech recognition into iOS 5 (which will probably ship only with the iPhone 5). Apple has recently been in talks to build Nuance technology (the folks behind Dragon NaturallySpeaking) into the iPhone, and Apple also recently bought “Siri”, the personal assistant app.
So your future command to your phone won't only be “Call Jane”, but might just be: “Arrange dinner with Jane tomorrow night at an Italian restaurant in Cape Town”. The phone would then use numerous services on the web to arrange all of this for you… Take a look at this video of Siri to see what I am on about:
Sure, we as South Africans might have some challenges because our accents do not match those of the Yanks (who typically get catered for first), and many of the online services do not always arrive on our shores, but it is exciting nonetheless. Only time will tell whether these enhancements really do mean more people use speech recognition.
But how does it all work?
This infographic does a pretty good job of explaining automatic speech recognition (look out for the Mike Tyson quip):