Microsoft's custom voice recognition service hits public beta

The service lets developers tailor cloud voice recognition for specific scenarios

Companies building applications that use speech recognition have a new machine learning-based tool to improve their work. Microsoft is opening the public beta of its Custom Speech Service, the company said Tuesday.

The service, formerly known as CRIS, lets customers train a speech recognition system for a specific scenario so that it produces more accurate results. For example, the Custom Speech Service can be trained to perform better in a noisy airport, or tuned for voices from a particular group, such as children or speakers with particular accents.

Right now, the Custom Speech Service works with English and Chinese, but one of its advantages is that it can be trained to work with accents from non-native speakers.

Microsoft is making it available as part of its suite of Cognitive Services, a set of cloud-based tools aimed at opening up the fruits of the company’s artificial intelligence and machine learning research to the rest of the world.

Right now, there are eight such cognitive services generally available, and an additional 17 in beta. More than 424,000 developers have tried the services since they launched, Microsoft said. Developers all over the world can access the services, many of which are available for purchase through Microsoft Azure.

Each of the services has a free tier with heavy limits on its use, so developers have the freedom to test the APIs out without spending a cent. The Custom Speech Service has a complicated, tiered pricing model that includes a subscription fee along with charges based on the number of voice samples fed into the system and the amount of acoustic adaptation training.

The Custom Speech Service is a key tool in the arsenal of Human Interact, a small game development shop using voice commands as the sole means of interaction for its forthcoming game Starship Commander. Custom speech recognition, along with Microsoft’s Language Understanding Intelligent Service (LUIS), makes up key parts of the voice recognition and understanding system that players use to guide their ship.

The service allows Human Interact to create its own dictionary specific to Starship Commander, which means the system can understand players when they ask about the Ecknians, the game’s alien antagonists. After players' speech has been transcribed into machine-readable text, LUIS processes it and turns it into game commands.
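
To make the two-stage flow concrete, here is a minimal Python sketch of the idea: a transcript comes in, game-specific terms are corrected, and the text is mapped to a command. Everything in it is an assumption for illustration. The names `CUSTOM_VOCABULARY`, `INTENT_PATTERNS`, `GameCommand`, and `to_command` are hypothetical, the misheard variants are invented, and the regular expressions merely stand in for a language-understanding service. It does not call the Custom Speech Service or LUIS, and it is not Human Interact's implementation. In particular, the real service adapts the recognizer itself, whereas this sketch only patches up text after the fact.

```python
import re
from dataclasses import dataclass

# Hypothetical game vocabulary: the canonical term, plus invented examples of
# how a generic recognizer might mishear it. A trained custom model would aim
# to get the term right at recognition time; this lookup is only a stand-in.
CUSTOM_VOCABULARY = {"ecknians": ["eck neons", "ec nians", "acknians"]}

# Hypothetical intent patterns standing in for a language-understanding step.
INTENT_PATTERNS = [
    ("ask_about",
     re.compile(r"\b(who|what) (are|is) the (?P<entity>\w+)")),
    ("set_course",
     re.compile(r"\b(set (a )?course|fly|go) (to|for) (the )?(?P<entity>\w+)")),
]


@dataclass
class GameCommand:
    intent: str
    entity: str


def normalize_transcript(text: str) -> str:
    """Lowercase the transcript and map misheard variants to game terms."""
    text = text.lower()
    for canonical, variants in CUSTOM_VOCABULARY.items():
        for variant in variants:
            text = text.replace(variant, canonical)
    return text


def to_command(transcript: str) -> GameCommand:
    """Turn recognized text into a game command, or 'unknown' if none fits."""
    text = normalize_transcript(transcript)
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.search(text)
        if match:
            return GameCommand(intent, match.group("entity"))
    return GameCommand("unknown", "")


if __name__ == "__main__":
    print(to_command("Who are the eck neons?"))
    print(to_command("Set a course for Mars"))
```

Keeping recognition and understanding as separate steps mirrors the split the article describes: one component is responsible for getting the words right, the other for deciding what the player meant.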

Both systems are important to the core gameplay of Starship Commander. Human Interact set out to make an interactive virtual reality experience that was accessible to a wide range of players, not just those who have been playing video games for years, creative director Alexander Mejia said.

"The answer was stupidly clear," Mejia said. "What if you just talk to somebody? I mean, if we put a person in front of you, and they start talking to you, would you talk back?"

To that end, the company opted to use the microphones that are built into the Oculus Rift and Gear VR systems and create a game that feels like a much more open-ended and immersive choose-your-own-adventure book.

Microsoft is far from the only company providing machine learning-based cloud voice recognition, but its services were the best for what the team is doing, Mejia said. The services provide what the team needs for not only custom dictionaries, but also fast response times and the ability to see and validate the results that the voice recognition system puts out.

Two other cognitive services from Microsoft will reach general availability next month. The Content Moderator service is designed to automatically block objectionable content in text, videos, and images while allowing for human review of questionable cases. It can detect profanity in more than 100 languages and allows customers to include custom lists of objectionable text as well. 

The Bing Speech API is designed to give developers an easy, generalized way to convert speech to text and vice versa. It supports voice recognition from 18 languages and dialects from 28 countries, including German, French, Chinese, Spanish, and Arabic. Developers can also use the API to do text-to-speech work in 10 languages with support for dialects from 18 countries.  

Microsoft is battling with a number of other cloud companies in this area, including Google, Amazon, and IBM, which each have their own set of machine intelligence-based tools.


Blair Hanley Frank

IDG News Service