Researchers shout about voice recognition

The ASSF (auditory spectrum-based speech feature) approach could be found in commercial products in three to five years, opening the door to applications such as voice-controlled Web surfing on mobile phones or personal digital assistants, the researchers say.

ASSF uses less processing power than the widely used MFCC (Mel-frequency cepstral coefficient) technology because it looks at fewer parameters when it interprets the waveforms of the user's speech. Instead of filtering the sound for all those parameters, ASSF applies more sophisticated decision rules to the data it gathers about the waveforms.

The decision rules, unlike the complex algorithms used in interpreting waveforms, can run in memory--for example, RAM in the case of a PC--so ASSF can run on a system with a less powerful processor, says Chuang Wen-hao, a Hong Kong Polytechnic University lecturer who described the technology at Game Technology Conference 2001.

In a quiet setting, ASSF makes more errors than MFCC does, but it outperforms MFCC in a noisy environment, such as the one users would find while riding a subway, he says. Chuang's team is now working on reducing the speech recognition error rate--currently more than 70 percent in the noisiest setting--to usable levels.

Complex commands

ASSF is also best suited to recognizing commands rather than complex statements. However, the technology could allow for handheld devices that respond to more complex commands than phones can handle today; those are generally limited to statements such as "Call Bob" and may be affected by outside noise, according to Chuang.

"If you're in a noisy environment, the speech recognition feature might not work very well," Chuang says.

Another potential use of ASSF speech recognition, and one highlighted at the conference, is voice-controlled computer games. A sophisticated voice-controlled game could respond quickly to commands such as "Fire!" even under noise conditions very different from those of a typical office, Chuang says.

"One of the challenges for game companies is to create a more immersive environment for users," Chuang says.

Chuang and a colleague, lecturer William Wang-gen Wan, are leading a team of five researchers at Hong Kong Polytechnic University, Hong Kong University of Science and Technology, and Shanghai University. The project began in mid-1999, he says. Although the team has no commercial partners, it would welcome such a partnership, he says.


Stephen Lawson

PC World
