Researchers shout about voice recognition
23 January, 2001 10:45
The ASSF (auditory spectrum-based speech feature) approach could be found in commercial products in three to five years, opening the door to applications such as voice-controlled Web surfing on mobile phones or personal digital assistants, the researchers say.
ASSF uses less processing power than the widely used MFCC (Mel-frequency cepstral coefficient) technology because it looks at fewer parameters when it interprets the wave forms of the user's speech. Instead of filtering the sound for all those parameters, ASSF applies more sophisticated decision rules to the data it gathers about the wave forms.
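For readers unfamiliar with the MFCC baseline mentioned above, the following is a minimal sketch of the conventional MFCC pipeline for a single audio frame: window, power spectrum, mel-spaced triangular filters, log, then a discrete cosine transform. It is not ASSF, whose internals are not described here. The function names, the 8 kHz sample rate, the 20 filters and the 12 coefficients are all illustrative assumptions rather than values from the researchers.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common mel-scale formula (one of several conventions in use).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    # A Hamming window reduces spectral leakage at the frame edges.
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_points = [low + (high - low) * i / (num_filters + 1)
                  for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    filters = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):        # rising edge
            row[k] = (k - left) / (center - left)
        for k in range(center, right):       # peak and falling edge
            row[k] = (right - k) / (right - center)
        filters.append(row)
    return filters


def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12):
    """Return cepstral coefficients 0..num_coeffs-1 for one frame."""
    spec = power_spectrum(frame)
    fbank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Floor the energies so that log() never sees zero.
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                    for row in fbank]
    # Unnormalised DCT-II decorrelates the log filterbank energies.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]


# Hypothetical input: one 256-sample frame of a 440 Hz tone at 8 kHz.
frame = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
coeffs = mfcc(frame, 8000)
```

Each frame passes through every filter in the bank before the transform, which gives a rough sense of the per-frame filtering cost that an approach using fewer parameters would aim to reduce.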
The decision rules, unlike the complex algorithms used in interpreting wave forms, can run in memory (for example, RAM in the case of a PC), so ASSF can run on a system with a less powerful processor, says Chuang Wen-hao, a Hong Kong Polytechnic University lecturer who described the technology at Game Technology Conference 2001.
In a quiet setting, ASSF makes more errors than the current MFCC does, but it outshines MFCC in a noisy environment, such as the one users would find while riding a subway, he says. Chuang's team is now working on reducing speech recognition error rates, which currently exceed 70 percent in the noisiest setting, to usable levels.
In addition, ASSF is best suited to recognizing commands rather than complex statements. However, the technology could allow for handheld devices that respond to more complex commands than phones can handle today; current phones are generally limited to statements such as "Call Bob" and may be affected by outside noise, according to Chuang.
"If you're in a noisy environment, the speech recognition feature might not work very well," Chuang says.
Another potential use of ASSF speech recognition is voice-controlled computer games, an application highlighted at the conference. A sophisticated voice-controlled game could respond quickly to commands such as "Fire!" even under noise conditions that might be very different from the typical office environment, Chuang says.
"One of the challenges for game companies is to create a more immersive environment for users," Chuang says.
Chuang and a colleague, lecturer William Wang-gen Wan, are leading a team of five researchers at Hong Kong Polytechnic University, Hong Kong University of Science and Technology, and Shanghai University. The project began in mid-1999, Chuang says. Although the team has no commercial partners, it would welcome such a partnership, he adds.