
Speech Technology Will Be Really Big – Watch Google

Andy Capp

Phonemes wanted – talk to Google

If you want confirmation that speech technology is the next big technical and economic opportunity, keep an eye on Google. This year it encouraged the formation of the Open Handset Alliance, a move that undermines the walled gardens created by the existing telecom companies. The result is a more level and competitive playing field.

It is interesting to see how Google is now developing its own stake in what will be a highly profitable marketplace. In an interview (Google wants your phonemes), Marissa Mayer, Google’s vice president of Search Products & User Experience, revealed one part of the effort:

“You may have heard about our [directory assistance] 1-800-GOOG-411 service. The reason we really did it is because we need to build a great speech-to-text model.

“The speech recognition experts that we have say: If you want us to build a really robust speech model, we need a lot of phonemes, which is a syllable as spoken by a particular voice with a particular intonation. So we need a lot of people talking, saying things so that we can ultimately train off of that. … So 1-800-GOOG-411 is about that: Getting a bunch of different speech samples so that when you call up, we can (understand) with high accuracy.”

This approach is adopted because Google Is All About Large Amounts of Data. Peter Norvig, director of research at Google, believes the following:

“The way to get better understanding of text is through statistics rather than through handcrafted grammars and lexicons. The statistical approach is cheaper, faster, more robust, easier to internationalize, and so far more effective.

“We wanted speech technology that could serve as an interface for phones and also index audio text. After looking at the existing technology, we decided to build our own. We thought that, having the data and computational resources that we do, we could help advance the field. Currently, we are up to state-of-the-art with what we built on our own, and we have the computational infrastructure to improve further. As we get more data from more interaction with users and from uploaded videos, our systems will improve because the data trains the algorithms over time.”
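
To make the statistical idea concrete, here is a toy sketch of how counting words in example sentences, rather than writing grammar rules by hand, can pick the more plausible of two candidate transcriptions. It is only an illustration and not Google's system: the bigram model with add-one smoothing, the function names (train_bigrams, log_prob, pick_best) and the sample directory-assistance requests are all assumptions made up for this example.

```python
from collections import Counter
import math


def train_bigrams(sentences):
    """Count word pairs in example sentences (hypothetical training data)."""
    contexts = Counter()   # how often each word appears as the first of a pair
    bigrams = Counter()    # how often each adjacent word pair appears
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Log-probability of a sentence under the bigram counts, add-one smoothed."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen pairs from having zero probability.
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


def pick_best(candidates, model):
    """Return the candidate transcription the model finds most likely."""
    return max(candidates, key=lambda c: log_prob(c, model))


# Made-up requests standing in for the kind of data a service might collect.
model = train_bigrams([
    "pizza in palo alto",
    "pizza near mountain view",
    "coffee in palo alto",
    "hotels in mountain view",
])
print(pick_best(["pisa in polo auto", "pizza in palo alto"], model))
```

Nothing in the sketch knows any grammar. The preference for one transcription over the other comes entirely from the counts, which is why more data from more callers would be expected to improve the results.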

Google is certainly in a privileged position to gain access to large amounts of data that can be used to improve its other services. However, it seems somewhat paradoxical to rely on number crunching to better understand language and speech.

Others take a different view. For example, Powerset is building a consumer search engine based on breakthrough natural language processing technology licensed from PARC and developed internally. The search engine aims to leverage the structure and nuances of natural language to ultimately transform the way humans interact with computers.

It will be interesting to see which approach wins out.

Related: Can You Hear The Future?


3 Responses to “Speech Technology Will Be Really Big – Watch Google”

  1. Dito Says:

    wow i never thought of google collecting our voices thru that service to improve their software…what a brilliant idea.

  2. Aaron Says:

    Dito,

    I actually thought about that when I was using GOOG-411. Every time someone says “go back” or “start over” or anything that would make you repeat what you wanted, they are probably tracking it to further improve their software with different accents and dialects. They analyze everything else, why not Google Voice Analytics?

  3. james johnson Says:

    Google can steal my voice all they want! As long as when I call companies they can actually understand what I’m saying! I hate yelling at the phone and pressing zero repeatedly until I actually find an operator. I use Goog-411 and have had very few problems, except when a company is pronounced differently than it’s spelled. I just use the “Spelling Pronunciation” though and it works fine! Great article either way, keep up the good work.

 
