... each spoken word is matched
against similarly formed pre-stored words in the computer's electronic dictionary. The
electronic dictionary (a database of pre-stored words) is created during a
training session. That is, the system is first operated in a training mode, in which the
words and phrases to be stored in the electronic dictionary are spoken several times to
train the system to recognize them. In this mode, patterns for all the words and
phrases spoken during the training session are created and stored for future matching.
5. Perform corresponding action. When a match for the spoken word(s) is found, it is
displayed on the computer's terminal and/or the action corresponding to the
input is performed (for example, stopping whatever the system was doing). In some cases,
the system may display the word(s) and wait for the operator's confirmation, so that it does
not do something different from what the operator intended (in case the computer has
misinterpreted the spoken word(s)). If no match is found for the spoken word, the
speaker is asked to repeat the word.
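The training, matching, and action steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any particular system: it assumes each spoken word has already been converted to a short list of numbers (its "pattern"), that the stored pattern is the average of the training repetitions, and that "similarly formed" means a small Euclidean distance. The names train, match, respond, and the threshold value are hypothetical.

```python
import math

def train(samples_by_word):
    """Training mode: each word is spoken several times; store one averaged pattern per word."""
    dictionary = {}
    for word, samples in samples_by_word.items():
        count = len(samples)
        # Average the repetitions position by position (assumes equal-length patterns).
        dictionary[word] = [sum(values) / count for values in zip(*samples)]
    return dictionary

def match(dictionary, pattern, threshold=1.0):
    """Return the stored word closest to the pattern, or None if none is close enough."""
    best_word, best_distance = None, float("inf")
    for word, stored in dictionary.items():
        distance = math.dist(stored, pattern)  # Euclidean distance (an assumption)
        if distance < best_distance:
            best_word, best_distance = word, distance
    return best_word if best_distance <= threshold else None

def respond(dictionary, pattern, actions, confirm=None):
    """Perform the action for the matched word, optionally after operator confirmation."""
    word = match(dictionary, pattern)
    if word is None:
        return "Please repeat the word."   # no match found
    if confirm is not None and not confirm(word):
        return "Cancelled: " + word        # operator rejected the interpretation
    return actions[word]()

# Made-up training data: two repetitions of each word, two numbers per pattern.
dictionary = train({
    "STOP": [[1.0, 0.0], [0.8, 0.2]],
    "GO": [[0.0, 1.0], [0.2, 0.8]],
})
actions = {"STOP": lambda: "stopping", "GO": lambda: "going"}

print(respond(dictionary, [0.95, 0.05], actions))
print(respond(dictionary, [5.0, 5.0], actions))
```

The threshold is what lets the sketch report "no match" instead of always returning the nearest word; a real system would have to tune it, and would use far richer patterns than two numbers.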
Voice recognition systems are normally classified into the following two categories:
1. Speaker-dependent. Due to the wide variation in the accents of different speakers,
most voice recognition systems of today are speaker-dependent. That is, they can
recognize the speech of only a single individual or a few individuals. These
individuals must have participated in the training session to train the system to recognize
their pronunciation of the pre-stored words in the computer's dictionary. The system usually
maintains a separate database of pre-stored words for each individual because the
digital form of the same word may differ from one individual to another.
2. Speaker-independent. Speaker-independent voice recognition systems can recognize
words spoken by anyone. Since a speaker-dependent system already needs a separate set of
patterns for each individual it serves, a speaker-independent system would require a very
large database of pre-stored words to accommodate anyone's voice pattern. To get around
this practical problem, speaker-independent systems are designed to have a very
limited vocabulary. For example, the vocabulary of such a system may have the words
YES, NO, ZERO, ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE and
This document was uploaded on 04/07/2014.
- Spring '14