Resolved and Unresolved Issues in Windows Speech Recognition

(Speech Recognition Office 2003 vs. Windows Speech Recognition)

November 2005, Updated June 2009

Itamar Even-Zohar

Itamar@even-zohar.com

 

 

 

Windows XP Speech Recognition in Office 2003 (MS-SR)

 

Windows Speech Recognition (WSR)

 

Comments

1.   

MS Speech works only in MS applications (Office and WordPad).

MS Speech works in various applications, not just Office. However, it won’t work fully in applications that do not support the Text Services Framework (TSF), such as OpenOffice.

See the Wikipedia entry for Text Services Framework.

2.   

No mouse and keyboard emulation.

Full mouse and keyboard emulation.

 

3. 

A relatively limited set of commands.

Many new commands have been added.

More commands are available, and new ones can be created with the new macro feature. See below.

4. 

No Transcription.

Still no Transcription.

Martin Markoe-BratT’s Toolkit offers a basic transcription function.

5. 

No Multilingual abilities.

Six languages are offered with Speech Recognition. Each, however, must be handled separately and requires changing the computer’s display language.

The languages are: English (US and UK), Chinese, French, German, Japanese, and Spanish.

For details see “Multilingual Language use in Windows Vista.”

6. 

You must switch to Command Mode for extended navigation commands.

No more need to switch. All commands are available at the same time.

 

In case of recognition difficulty, you can now force a command by prefacing it with “Computer.” If you want to force a dictation utterance, preface the utterance with “Insert.”

It is nevertheless helpful to have a Command Mode and a Dictation Mode available AS AN OPTION. In Dragon NaturallySpeaking (DNS), you can force a command or dictation by pressing CTRL or SHIFT. You can also switch to Command Mode.

7.              

There is no Configuration feature for setting preferences (such as the format of punctuation or measures).

Still no proper configuration. You can now configure the number of spaces after terminating punctuation.

DNS has a whole array of configurable preferences. Spaces after punctuation are just ONE item among MANY.

8.              

The Language Bar does not fuse with the other menus in Office and often stands in the way.

No change.

 

9.              

No correction box to comfortably enter a correction when the pop-up list of alternatives does not offer the correct item.

There is a full correction box (“Correction Dialogue”), but still no CORRECTION LINE for selecting an item from the list and modifying it by voice or by typing.

The Correction Box is called the “Spell dialog box” in DNS.

10.           

When typing the correct item, you must retype its first characters if the first two or more characters coincide with the first characters of items on the list of alternatives.

This particular problem does not occur in the new Correction Dialogue.

 

11.           

An extra space is still inserted after a left parenthesis and a left quotation mark.

(If you wish to circumvent this problem, you must not pause even slightly after dictating the punctuation mark.)

It seems that this bug has been fixed in MS-Word, though probably not yet in WordPad.

 

12.           

An extra “00” is added to whole numbers (for example, $2000.00 instead of $2000).

This has been fixed.

 

13.           

No way to convert text to numbers or numbers to text by voice

(like the DNS commands “Format That Text” and “Format That Numbers”).

No change. Rob Chambers’ comment:

Correct. We’re not planning on implementing that for Vista. We could provide a macro that would do this, though.

This feature has long been built into DNS.

14.           

No explicit and direct facility to create dictation macros.

A full macro language is available as an add-on. The current version supports all languages. To download it, click here. For a library of macros go to the Yahoo Group Files Section.

For a Manual and other documentation about macro writing go to the LINKS section of the Yahoo Group.

15.           

No ability to create alternative commands/strings of commands (command macros).

A full macro language is available as an add-on. The current version supports all languages. To download it, click here. For a library of macros go to the Yahoo Group Files Section.

 

16.           

No possibility to extract the sound component from the text and export it to be kept as a separate sound file.

This has not been fixed.

Rob Chambers’ comment:

We have no tool or user feature to allow you to convert text to a synthesized voice and save it as a .WAV, .WMA, or .MP3.

A COMMENT TO Rob’s comment:

 

I was NOT referring to a synthesized voice but to the speaker’s own voice.

17.           

Speech data can be saved with file.

No ability any longer to save speech data.

Rob Chambers’ comment:

“We don’t have any plans to change this for Vista.”

It is a real pity that such a highly important feature for correction and proofreading has been eliminated.

18.           

There is a Text-to-Speech button on the Language bar.

No Text-to-Speech button on the Language bar. No ability to play back dictation. Rob Chambers has written a macro, Read_That, which has the computer voice read the text aloud, but it can be activated only by voice.

This is very unhelpful, as playback helps improve dictation behavior.

19.           

There is a Correction Button on the Language Bar.

No Correction Button on the Language Bar.

 

20.           

The profile (“user”) can be saved, exported, and imported only via an add-on (not included in Office 2003 and known only to members of the Yahoo group).

A Save/Restore add-on for saving and restoring User Profiles was made available in June 2009. To download it, click here.

This is a huge step forward for WSR.

21.           

Added words cannot be exported as an editable list (as they can in DNS).

Rob Chambers has created a macro which can take care of exporting and importing entries to the Speech Dictionary. The macro is available from the Macro Library of the Yahoo Group.

A built-in feature would make it more useful.

22.           

Documentation is still very scant. Nothing has been added to the Help file. Installation of the SR feature is not explicit.

Rob’s comment:

We’ll have better help, likely. I’m hoping that since it’s included in all versions of the OS, that we’ll get some good authors to write books on Windows Speech Recognition though.

 

23.           

No Tutorial to make SR more accessible to beginners.

There is a new integrated tutorial/training wizard.

I also recommend Dan Newman’s “Talk to Windows” – an easy-to-use set of video lessons that shows you how to work with WSR.