10robust-handout

Massachusetts Institute of Technology
Department of Electrical Engineering & Computer Science
6.345/HST.728 Automatic Speech Recognition
Spring, 2010

4/27/10 Lecture Handouts
• Noise Robustness and Confidence Scoring

6.345/HST.728 Automatic Speech Recognition (2010) Noise Robustness and Confidence Scoring

Noise Robustness and Confidence Scoring
Lecturer: T. J. Hazen
• Handling variability in acoustic conditions
  – Channel compensation
  – Foreground noises and non-speech artifacts
  – Background noise compensation
• Computing and applying confidence scores
  – Recognition confidence scoring
  – Language understanding issues
  – Dialogue modeling issues

Typical Digital Speech Recording
[Block diagram: the signal path from the speaker's unique vocal tract to the digital signal, with stages joined by ∗ and + operators. Environmental effects: background noise, room reverberation. Channel effects: line noise, channel filter. Digitization: Nyquist filter, quantization noise.]

Motivation
• Recognizers make errors
• Some reasons for errors:
  – Presence of previously unseen words or events
  – Difficult acoustic conditions or background noises
  – Presence of highly confusable words
  – Insufficient amount of training data
  – Mismatch between training and testing data
  – Models too rigid to handle variability
• Methods for handling errorful data
  – Adjust or adapt to current conditions
  – Identify when errors occur and perform an action to recover

Noises and Non-Speech Artifacts
• Non-speech artifacts can be extremely varied
  – Background noises (music, dog bark, door slam, etc.)
  – Microphone and channel noises (clicks, beeps, static, etc.)
  – Non-lexical speaker noises (cough, laugh, lip smack, etc.)
• Noises can be simultaneous with speech

Recognition Experiments
• Experiments with the baseline JUPITER recognizer
  – Clean: Utterances with no out-of-vocabulary (OOV) words and no non-speech artifacts
  – With Noise: Utterances containing at least one non-speech artifact
  – With OOV: Utterances containing at least one OOV word
[Bar chart: Word Error Rate (%) and Sentence Error Rate (%) for four test sets: With Noise, With OOV, Clean, All Test Data. Numeric labels as extracted, assignment to bars not recoverable: 18.9, 38.3, 9.5, 22.4, 47.0, 77.7, 72.9, 100.]

Difficult Channel and Noise Conditions
• Variable system functions
  – From different channels (e.g., land line, cellular, etc.)
  – Different microphones
• Constant background noise
  – Channel static
  – Car engine noise
  – Air conditioning hiss
• Intermittent foreground or background noises
  – Cough
  – Laugh
  – Door slam
  – Handset taps or clicks
  – Phone ringing
  – Dog barking
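
The channel and noise conditions listed above are commonly described with a simple signal model: a channel's system function acts on the speech by convolution, and constant background noise is added to the result. The sketch below illustrates that model only; it is not taken from the handout. The sine-wave "speech", the 3-tap impulse response, the 10 dB signal-to-noise ratio, and the names `convolve`, `power`, and `record` are all assumptions chosen for illustration.

```python
import math
import random


def convolve(signal, impulse_response):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def power(x):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)


def record(speech, channel, snr_db, rng):
    """Filter speech with a channel, then add white noise at snr_db."""
    filtered = convolve(speech, channel)
    noise = [rng.gauss(0.0, 1.0) for _ in filtered]
    # Scale the noise so filtered power / noise power matches snr_db.
    target_noise_power = power(filtered) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    noise = [gain * v for v in noise]
    noisy = [f + n for f, n in zip(filtered, noise)]
    return noisy, filtered, noise


rng = random.Random(0)
# Hypothetical stand-in for speech: a 200 Hz tone sampled at 8 kHz.
speech = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(800)]
# Hypothetical 3-tap channel impulse response (a mild low-pass filter).
channel = [0.5, 0.3, 0.2]

noisy, filtered, noise = record(speech, channel, snr_db=10, rng=rng)
measured_snr = 10 * math.log10(power(filtered) / power(noise))
print(f"samples: {len(noisy)}, measured SNR: {measured_snr:.1f} dB")
```

Swapping in a different `channel` mimics a change of microphone or telephone line, and lowering `snr_db` mimics louder constant background noise. Intermittent noises such as a door slam would need a noise term that is nonzero only over a short span, which this sketch does not model.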

This note was uploaded on 05/08/2010 for the course CS 6.345 taught by Professor Glass during the Spring '10 term at MIT.

