10asr-handout

Massachusetts Institute of Technology
6.345/HST.728 Automatic Speech Recognition
Spring, 2010
5/4/10
Lecture Handouts
Speech recognition applications
ASR Applications 1: Speech Recognition Applications
6.345/HST.728 Automatic Speech Recognition (2010)
  - Medium-vocabulary ASR for spoken dialogue interaction
      - Example in a telephone-based weather information domain
  - Large-vocabulary ASR for spoken document retrieval
      - Example of academic lecture videos

ASR Applications 2: Example Dialogue-based Systems
  - Vocabularies typically have 1000s of words
  - Widely deployed systems tend to be more conservative
  - Directed dialogues have fewer words per utterance
  - Word averages lowered by more confirmations
  - Human-human conversations use more words
ASR Applications 3: Telephone-based, Conversational, ASR
  - Telephone bandwidths with variable handsets
  - Noisy background conditions
  - Novice users with small number of interactions
      - Men, women, children
      - Native and non-native speakers
      - Genuine queries, browsers, hackers
  - Spontaneous speech effects
      - e.g., filled pauses, partial words, non-speech artifacts
  - Out-of-vocabulary words and out-of-domain queries
  - Full vocabulary needed for complete understanding
      - Word and phrase spotting are not primary strategies
      - Mixed-initiative dialog provides little constraint to recognizer
  - Real-time decoding

ASR Applications 4: Data Collection Issues
  - Data collection has evolved considerably
      - Wizard-based → system-based data collection
      - Laboratory deployment → public deployment
      - 100s of users → thousands → millions
  - Data from real users solving real problems accelerates technology development
      - Significantly different from laboratory environment
      - Highlights weaknesses, allows continuous evaluation
      - But, requires systems providing real information!
  - Expanding corpora requires unsupervised training or adaptation to unlabelled data
ASR Applications 5: Data Collection (Weather Domain)
  - Continuous data collection since 1997
  - Initial corpus of 3,500 read and 1,000 wizard utterances

ASR Applications 6: Weather Corpus Characteristics
  - Approximately 11% of data contained significant noises
  - Over 6% of data contained spontaneous speech effects
  - At least 5% of data from speakerphones
  - Corpus dominated by American male speakers
ASR Applications 7: Vocabulary Selection
  - Constrained domains naturally limit vocabulary sizes
  - 2000 word vocabulary gives good coverage for weather
  - ~2% out-of-vocabulary rate on test sets
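
The out-of-vocabulary (OOV) figure quoted on the vocabulary selection slide can be illustrated with a small sketch. The handout does not say how the vocabulary was chosen or whether the rate is counted over word tokens or word types, so the code below assumes the simplest setup: keep the most frequent training words and report the fraction of test-set word tokens that fall outside that list. The function names and the toy utterances are hypothetical and are not taken from the course materials.

```python
from collections import Counter


def select_vocabulary(training_utterances, size):
    """Return the `size` most frequent words in the training utterances.

    Assumption: frequency-based selection; the handout does not state
    how the weather-domain vocabulary was actually chosen.
    """
    counts = Counter(
        word for utt in training_utterances for word in utt.lower().split()
    )
    return {word for word, _ in counts.most_common(size)}


def oov_rate(vocabulary, test_utterances):
    """Fraction of test-set word tokens that are not in the vocabulary."""
    tokens = [word for utt in test_utterances for word in utt.lower().split()]
    if not tokens:
        return 0.0
    missing = sum(1 for word in tokens if word not in vocabulary)
    return missing / len(tokens)


# Toy data, invented for illustration only.
train = [
    "what is the weather in boston",
    "will it rain in boston tomorrow",
    "what is the forecast for seattle",
]
test = ["what is the weather in denver tomorrow"]

vocab = select_vocabulary(train, size=2000)
rate = oov_rate(vocab, test)
print(f"OOV rate: {rate:.1%}")
```

In this toy case the only unseen test word is "denver". On a real corpus, the same computation can be repeated for several values of `size` to see where coverage levels off.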

This note was uploaded on 05/08/2010 for the course CS 6.345 taught by Professor Glass during the Spring '10 term at MIT.
