Google throws its weight behind voice

The company's size helps in speech recognition development, a researcher says

Google is taking advantage of its cloud infrastructure and the huge volume of typed search queries to refine its Voice Search function, part of a massive research effort in voice that spans both mobile devices and the Web.

Voice Search, introduced about 18 months ago, lets mobile users search the Web by speaking into their phones rather than typing in a query. It's available on the iPhone, BlackBerry, Nokia Series 60 devices and some Android phones.

Accuracy is a major factor for success, driving useful results that cause users to return to the service, said Michael Cohen, manager of speech technology at Google, in a speech Thursday at the Mobile Voice Conference in San Francisco. The company strives to make Voice Search a "frictionless" experience for the user, with correct results obtained easily. Making speech recognition more accurate has been a decades-long effort, and Google is applying its massive scale to the problem, Cohen said.

Voice Search is based on "language models," which are statistical models of what sequences of words are most likely to occur. For example, a good language model would know that it's more likely a speaker would say "the dog barked" than "the dog talked."
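To make the idea concrete, here is a minimal sketch of a toy language model in Python. It is not Google's system: it assumes a simple bigram model (each word predicted from the previous one) with add-one smoothing, and the four-sentence corpus and all function names are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_counts(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def bigram_probability(counts, prev, nxt, vocab_size, alpha=1.0):
    """Estimate P(nxt | prev) with additive (add-alpha) smoothing."""
    total = sum(counts[prev].values())
    return (counts[prev][nxt] + alpha) / (total + alpha * vocab_size)

def sequence_score(counts, sentence, vocab_size):
    """Multiply the bigram probabilities along the sentence."""
    words = sentence.lower().split()
    score = 1.0
    for prev, nxt in zip(words, words[1:]):
        score *= bigram_probability(counts, prev, nxt, vocab_size)
    return score

# Hypothetical training text; a real model would use far more data.
corpus = [
    "the dog barked",
    "the dog barked loudly",
    "a dog barked",
    "the man talked",
]
counts = train_bigram_counts(corpus)
vocab = {word for sentence in corpus for word in sentence.lower().split()}

barked = sequence_score(counts, "the dog barked", len(vocab))
talked = sequence_score(counts, "the dog talked", len(vocab))
print(barked > talked)
```

Because "barked" follows "dog" in the toy corpus and "talked" never does, the model scores "the dog barked" higher, which is the kind of preference a recognizer can use to choose between similar-sounding candidates.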

Google is constantly "training" new language models for its speech recognition engine, Cohen said. In doing so, it taps into the search terms that users type into Google.com. From 230 billion words typed in search requests at Google.com, researchers have compiled the 1 million most frequently used unique words to form a vocabulary with which to train the voice system. Both numbers are arbitrary, and 230 billion does not represent the total number of words entered at Google in any given period, Cohen said. AskOxford.com, from the publisher of the Oxford English Dictionary, estimates that there are at least 250,000 words in the English language; Cohen said the 1 million unique words include plurals and other versions of words.
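The vocabulary step Cohen describes, keeping only the most frequently used unique words, can be sketched in a few lines of Python. This is an illustration under assumptions, not Google's pipeline: the whitespace tokenization, the sample queries and the function name are all hypothetical.

```python
from collections import Counter

def build_vocabulary(queries, max_size):
    """Return the max_size most frequent words across all queries."""
    counts = Counter()
    for query in queries:
        # Assumption: lowercase and split on whitespace; real systems
        # would need more careful text normalization.
        counts.update(query.lower().split())
    return [word for word, _ in counts.most_common(max_size)]

# Hypothetical typed queries standing in for the search logs.
queries = ["cheap flights", "cheap flights paris", "cheap hotels"]
vocab = build_vocabulary(queries, 2)
print(vocab)
```

Since plurals such as "flights" and "hotels" are counted as their own entries here, a vocabulary built this way can easily exceed the number of dictionary headwords, consistent with Cohen's explanation of the 1 million figure.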

It takes 70 "CPU years" -- the amount of work one CPU can perform in a year -- to process those 230 billion words from Google.com and train a new language model, Cohen said. Google trains new language models constantly as part of its research.

"There are huge computational demands as we're taking on lots and lots of data (and) bigger and bigger models," Cohen said. "Luckily, we have a lot of compute power we can apply to that. And there are demands on infrastructure, and luckily, Google has a very well-designed software infrastructure, so we can do things like quickly parallelize something," running it on thousands of computers at the same time, he said.
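A rough back-of-the-envelope calculation shows why parallelizing matters for a 70-CPU-year job. The sketch below assumes ideal parallelism, meaning the work divides evenly with no coordination overhead, which real systems do not achieve; the CPU counts are illustrative, not figures from Google.

```python
def wall_clock_days(cpu_years, num_cpus, days_per_year=365.0):
    """Elapsed days for a job, assuming perfectly even division of work."""
    if num_cpus <= 0:
        raise ValueError("num_cpus must be positive")
    return cpu_years * days_per_year / num_cpus

# 70 CPU years is the figure Cohen cited; the CPU counts are hypothetical.
for cpus in (1, 1000, 5000):
    print(cpus, "CPUs:", round(wall_clock_days(70, cpus), 2), "days")
```

Under these assumptions, a job that would occupy a single CPU for 70 years finishes in under a month on 1,000 CPUs, which is what makes training new models "constantly" practical.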

A cloud infrastructure offers other advantages in speech recognition, he said. For one thing, Google can rapidly test and refine its speech recognition software, sending out new versions while consumers are using it in the field. In addition, as consumers use Voice Search, Google learns from real-world experiences.

In addition to making speech recognition easier to use, Google wants to make it ubiquitously available. A big step in that direction was a feature included in the Nexus One handset that gives the user the option of speaking instead of typing every time the keyboard pops up on the phone's screen, Cohen said.

Speech recognition is also a big part of Google Voice, powering its voicemail transcription feature. But Google's interest in voice goes beyond mobile phones, Cohen said. Voice is the biggest group in Google Research, and findings in this area can be useful in many areas, he said. The company wants to be able to understand and deliver spoken content on the Web as well as the written information it finds now through its search engine. One recent move was the addition of a closed-caption option for YouTube videos. Using that capability, Google is also beginning to offer foreign-language subtitles through text-to-text translation of those captions.

Cohen was a co-founder of Nuance Communications and has been working on speech recognition for 25 years. In that time, "It's come a long way, but it has a long way to go," he said.

Microsoft is also developing voice search capabilities for its Bing search engine.

Stephen Lawson

IDG News Service