
Speech Technology Will Be Really Big - Watch Google

Andy Capp

Phonemes wanted - talk to Google

If you want confirmation that speech technology is the next big technical and economic opportunity, keep an eye on Google. This year the company encouraged the formation of the Open Handset Alliance, a move that undermines the walled gardens created by the existing telecom companies. The result is a more level and competitive playing field.

It is interesting to see how Google is now developing its own stake in what will be a highly profitable marketplace. In an interview ("Google wants your phonemes"), Marissa Mayer, Google’s vice president of Search Products & User Experience, revealed one part of the effort:

You may have heard about our [directory assistance] 1-800-GOOG-411 service. The reason we really did it is because we need to build a great speech-to-text model.

The speech recognition experts that we have say: If you want us to build a really robust speech model, we need a lot of phonemes, which is a syllable as spoken by a particular voice with a particular intonation. So we need a lot of people talking, saying things so that we can ultimately train off of that. … So 1-800-GOOG-411 is about that: Getting a bunch of different speech samples so that when you call up, we can (understand) with high accuracy.

Google takes this approach because Google is all about large amounts of data. Peter Norvig, director of research at Google, puts it this way:

The way to get better understanding of text is through statistics rather than through handcrafted grammars and lexicons. The statistical approach is cheaper, faster, more robust, easier to internationalize, and so far more effective.

We wanted speech technology that could serve as an interface for phones and also index audio text. After looking at the existing technology, we decided to build our own. We thought that, having the data and computational resources that we do, we could help advance the field. Currently, we are up to state-of-the-art with what we built on our own, and we have the computational infrastructure to improve further. As we get more data from more interaction with users and from uploaded videos, our systems will improve because the data trains the algorithms over time.
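To make the statistical idea concrete, here is a minimal sketch in Python. It is not Google's system: the tiny corpus, the candidate transcriptions, and the function names are all made up for illustration. It counts word pairs in sample queries and uses those counts (with add-one smoothing) to decide which of two candidate transcriptions is more plausible.

```python
from collections import Counter
import math

# Hypothetical toy corpus standing in for a large volume of real user queries.
CORPUS = [
    "pizza in palo alto",
    "pizza near palo alto",
    "hotels in palo alto",
    "pizza in mountain view",
]

def train_bigrams(sentences):
    """Count single words and adjacent word pairs across the corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()  # <s> marks the sentence start
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def log_score(sentence, unigrams, bigrams):
    """Log-probability of a sentence under an add-one smoothed bigram model."""
    words = ["<s>"] + sentence.split()
    vocab_size = len(unigrams)
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total

def pick_best(candidates, unigrams, bigrams):
    """Return the candidate transcription the model finds most likely."""
    return max(candidates, key=lambda c: log_score(c, unigrams, bigrams))

unigrams, bigrams = train_bigrams(CORPUS)
# Two hypothetical outputs from an acoustic front end for the same audio.
candidates = ["pizza in palo alto", "pizza inn pal oh alto"]
best = pick_best(candidates, unigrams, bigrams)
print(best)
```

No grammar rules were written by hand here. Adding more sample queries changes the counts, and with them the model's preferences, which is the sense in which more data improves the system.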

Google is certainly in a privileged position to gain access to large amounts of data that can be used to improve other services. However, it seems somewhat paradoxical to be using number crunching to better understand language and speech.

Others take a different view. For example, Powerset is building a consumer search engine based on breakthrough natural language processing technology licensed from PARC and developed internally. The search engine aims to leverage the structure and nuances of natural language to ultimately transform the way humans interact with computers.

It will be interesting to see which approach wins out.


2 Responses to “Speech Technology Will Be Really Big - Watch Google”

  1. Dito Says:

    wow i never thought of google collecting our voices thru that service to improve their software…what a brilliant idea.

  2. Aaron Says:

    Dito,

    I actually thought about that when I was using GOOG-411. Every time someone says “go back” or “start over” or anything that would make you repeat what you wanted, they are probably tracking it to further improve their software with different accents and dialects. They analyze everything else, why not Google Voice Analytics?
