Speech recognition software is designed to ease PC use for users who find traditional keyboard input difficult, and recently it was used by US intelligence agencies to translate foreign language news networks into English in real time. But in the home, can your PC and the current speech software live up to the hands-free hype? In this tutorial we investigate speech dictation and recognition, and how to really get up and speaking, so you can see if it is time to talk the talk.
Microsoft has included speech recognition functionality in its Tablet PC edition of Windows XP and, to a lesser extent, in its Office XP software (in Word 2002 go to Tools-Speech). There are also stand-alone speech recognition packages available, most notably IBM’s ViaVoice (www.ibm.com/au), which starts at $137.50. Dragon NaturallySpeaking 7 (www.scansoft.com) comes in several versions, starting at $99.95 for the Essentials version. The NaturallySpeaking ‘Preferred’ version ($399.95), which is bundled with a microphone headset, was used for testing in this article.
The Custom Installation allows you to choose the optional tutorial files and, more importantly, choose your English pronunciation/spelling pack type. American, British, Australian, Indian and South East Asian versions of English are available for selection.
Now you need to enable the microphone and speaker settings, so you can speak and hear what you are dictating. Double-click on the speaker (volume) icon in your taskbar. If you can’t see it, go to Control Panel-Sounds and Audio Devices Properties and tick the Place volume icon on the taskbar option. From the volume controls, pull down the Options menu, select Properties and make sure Microphone is selected (on some machines you may have to select the Recording radio button to see it). Now click OK and ensure that the Microphone volume control is not set to mute, and that it is set relatively low. This will stop you from creating feedback, which may startle you as well as potentially harm your speakers, when you follow the supplied instructions to plug in the headset.
You’ll also need to choose whether to send audio to your headset or your desktop speakers. If you decide to use the headset speaker, don’t forget to turn your PC’s speakers down or off before unplugging them (to protect against damage) and to turn down the Wave volume slider.
Learning to speak
Continue through the new user wizard. This steps you through microphone testing and basic ‘training’ of the software to recognise your voice’s inflections. Follow the arrow prompt to start reading aloud and do so in a clear, deliberate but natural manner (think news anchor style!). Words will become grey when the software recognises them. To take a break, click the Pause button.
TIP: if you have a high resolution monitor setting with small fonts, you may wish to lower the setting to make the reading of training passages easier on the eyes. Also, you’ll be doing a lot of speaking, so have a drink handy! General training takes about 15 minutes.
The Dragon Bar
The main interface of the program includes the Tools menu, where the Accuracy Centre is located. This is a central location for many tools that help you improve accuracy, and it includes over 15 additional training scripts. The additional scripts take about 10 to 15 minutes each to perform and are recommended to improve the overall experience, even though the initial recognition script provides a high degree of success most of the time.
The ‘Command browser’ and ‘Add new command’ options are also located under the Tools menu. Commands are special voice prompts that the software recognises and carries out as tasks rather than typing them as text. A quick reference card included in the box lists common commands for navigation, formatting and punctuation. These can also be looked up in the Command browser. Note: if you say commands too slowly, they will be interpreted as dictated text and not as commands.
To start dictating, click the microphone icon at the left of the Dragon Bar to turn the microphone on (or use the numeric pad plus key to toggle it on/off). Then open a new document in your desired application — Outlook, MS Word, even ICQ and other IMs are available — and commence dictation. As you dictate, it takes a few moments for the text to appear but you can read what’s being interpreted in the small yellow box on screen.
TIP: select Transcribe Recording from the Sound menu to open a WAV file and have the speech component turned to text.
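Before handing a recording to the software, it can help to know what is actually in the file. The following is a minimal sketch, not part of NaturallySpeaking, that uses Python's standard wave module to report a WAV file's channels, sample rate, bit depth and length. The function name describe_wav and the file name in the usage note are made up for illustration, and the article does not state which recording formats the program accepts, so treat the output as information only.

```python
import wave


def describe_wav(path):
    """Return basic properties of an uncompressed (PCM) WAV file as a dict.

    Hypothetical helper for illustration only; it inspects the file
    and does not interact with any speech recognition software.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()   # total number of audio frames
        rate = wav.getframerate()   # frames (samples per channel) per second
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / rate if rate else 0.0,
        }
```

For example, describe_wav("dictation.wav") on a hypothetical recording returns a dictionary you can print to see how long the recording is and how it was sampled before transcribing it.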
It seems that speech recognition now has many feasible and productive applications. As accuracy increases and the ability grows to control and navigate Windows using speech commands, the future of speech recognition software looks brighter.
Speech recognition software no longer needs the latest PC to run effectively, although it will help! The minimum recommended system for NaturallySpeaking 7 is a 500MHz PIII processor, 128MB of RAM, 300MB of hard disk space and a good sound card capable of recording (as most are these days). Hardware aside, you’ll need a quiet environment and an hour or so set aside.
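The minimum figures above can be written down as a simple checklist. This is a small sketch under stated assumptions: the numbers come from the paragraph above, while the names MINIMUM and check_minimum are invented for illustration, and you supply your own machine's figures by hand rather than having the script detect them.

```python
# Minimum recommended system for NaturallySpeaking 7, as quoted in this article.
MINIMUM = {"cpu_mhz": 500, "ram_mb": 128, "free_disk_mb": 300}


def check_minimum(cpu_mhz, ram_mb, free_disk_mb):
    """Return the names of any requirements the given machine falls short of.

    Hypothetical helper: an empty list means every minimum is met.
    """
    actual = {
        "cpu_mhz": cpu_mhz,
        "ram_mb": ram_mb,
        "free_disk_mb": free_disk_mb,
    }
    return [name for name, needed in MINIMUM.items() if actual[name] < needed]
```

A machine with a 450MHz processor, 256MB of RAM and 200MB of free disk space, checked with check_minimum(450, 256, 200), comes back as ["cpu_mhz", "free_disk_mb"], meaning the processor and disk space are the items to address. The sound card and the quiet room still have to be checked the old-fashioned way.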