"How is the weather in Tokyo?" A demonstrator is questioning "Maico," a cute, three-dimensional, 26-year-old female character on a display screen. "Yes? The weather in Tokyo? Let me check," Maico answers with a nod, looking straight at the demonstrator. The computer then searches for weather information for the Tokyo area via the Internet.
The demonstration was part of Sharp Corp.'s unveiling, at the Real World Computing (RWC) project's final exhibition and symposium last week in Tokyo, of a new type of interface that communicates with users. The technology underlying what Sharp calls its "multimodal agent interface" can be adopted as part of an easy-to-use, communicative Internet search engine, and expanded as an interface for home networking appliances, company officials said.
The company will be able to provide commercially available systems that use this type of interface in four to five years, officials said.
When a user speaks through a microphone to a PC equipped with a video camera, a computer-generated (CG) character recognizes the user's voice and gestures and responds by nodding. This nodding function is an important part of what Sharp calls its multimodal agent interface.
The timing of the nod is crucial to simulating real conversation. The company succeeded in making the CG character nod at the right time, right after a user speaks to it, just as in a real person-to-person chat, said Toshiro Mukai, an engineer at Sharp's system technology development center.
The interface is based on a nonverbal technology model and a language model. The former reads the user's voice volume and gestures to formulate the timing for the right nodding reactions. The latter recognizes users' words and movements, and commands the CG agent to react verbally and physically, with gestures. These developments give users a sense of "the computer as your chatting companion," Mukai said.
As part of the demonstration, the company also developed "Gabriel," a CG male figure, which does not recognize voice commands but responds to the volume of voices by nodding. Therefore, users who speak any language can talk to Gabriel and get a nodding reaction from him. "I think nodding timing is a universal thing. People who don't speak Japanese tried the system and loved it," Mukai said.
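The volume-only nodding described above can be illustrated with a minimal sketch. This is not Sharp's method, which the article does not detail; it simply assumes that a nod is triggered when a stretch of loud audio frames is followed by a short pause. The function name `nod_frames`, the thresholds, and the frame-based volume input are all hypothetical choices made for illustration.

```python
from typing import List


def nod_frames(
    volumes: List[float],
    speech_threshold: float = 0.3,  # assumed: volume at or above this counts as speech
    min_speech_frames: int = 3,     # assumed: loud frames needed before a nod is earned
    pause_frames: int = 2,          # assumed: quiet frames that mark the end of a phrase
) -> List[int]:
    """Return the frame indices at which a nod would be triggered.

    Language-independent by construction: only loudness is inspected,
    never the words themselves.
    """
    nods: List[int] = []
    speech_run = 0  # loud frames counted in the current utterance
    quiet_run = 0   # consecutive quiet frames seen so far

    for i, volume in enumerate(volumes):
        if volume >= speech_threshold:
            speech_run += 1
            quiet_run = 0
            continue

        quiet_run += 1
        if speech_run >= min_speech_frames and quiet_run == pause_frames:
            # Enough speech followed by a pause: nod now, then start over.
            nods.append(i)
            speech_run = 0
        elif quiet_run >= pause_frames:
            # The pause ended an utterance too short to deserve a nod.
            speech_run = 0

    return nods


if __name__ == "__main__":
    sample = [0.0, 0.5, 0.6, 0.7, 0.1, 0.1, 0.0, 0.8, 0.1, 0.1]
    print(nod_frames(sample))
```

In this sketch a brief cough or click (a single loud frame) does not earn a nod, while a sustained phrase followed by a pause does. A real system would also need to weigh gestures and tune the timing against human conversation, as the article notes.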
Currently, Maico can search for weather information and TV listings. Information requests can be as detailed as "What TV program will be on channel 8 at 7 p.m. tonight?" Maico can also answer questions such as "What is your favorite food?" or "Can you raise your right hand?" Sometimes, Maico asks the user to make a gesture in return, and if the user raises the wrong hand, Maico points out the mistake, Mukai said.
"We didn't want to make just a substitute for a remote control," Mukai said, noting that his team includes artificial intelligence researchers. "Our priority was to create a machine that becomes a friend of users, rather than to create a convenient interface that can network home appliances."
The interface software can be put onto any type of hardware, for example, a stuffed animal, which could become a "chatting companion" for elderly people who live alone, he said.
The RWC project was started in 1992 by Japan's Ministry of Economy, Trade and Industry to promote new information technology. The 10-year project is in its final year, and last week's exhibition was held to show all research achievements to date from 54 participating laboratories from all over the world.