Ch5-Automatic Speech Recognition-Advanced Topics

Search and Decoding in Speech Recognition
Automatic Speech Recognition: Advanced Topics
Veton Kpuska, February 13, 2012

Speech Recognition Systems Architecture

The task of speech recognition is to take an acoustic waveform as input and produce a string of words as output. HMM-based speech recognition systems view this task using the metaphor of the noisy channel. The intuition of the noisy channel model is to treat the acoustic waveform as a noisy version of the string of words, i.e., a version that has been passed through a noisy communications channel.

Noisy Channel View of SR

[Figure: the source sentence "The learning and knowledge that we have, is, at the most, but little compared with that of which we are ignorant!" passes through a noisy channel (Noise). The ASR decoder then produces guesses at the original sentence: "The learning and knowledge that we have", "The leaning over the edge", "The leaning over the hedge", "The learning and knowledge".]

This channel introduces noise, which makes it hard to recognize the true string of words. Our goal is then to build a model of the channel so that we can figure out how it modified the true sentence and hence recover it. The noisy channel view absorbs all the variability of speech mentioned earlier, including true noise.

If we have insight into the noisy channel model, that is, if we know how the channel distorts the source, we could find the correct source sentence for a waveform by taking every possible sentence in the language, running each sentence through our noisy channel model, and seeing if it matches the output. We then select the best-matching source sentence as our desired source sentence.

Implementing the noisy channel model as expressed on the previous slide requires solutions to two problems.

1. First, in order to pick the sentence that best matches the noisy input, we will need a complete metric for a best match.
Because speech is so variable, an acoustic input sentence will never exactly match any model we have for this sentence. As we have suggested in previous chapters, we will use probability as our metric. This makes the speech recognition problem a special case of Bayesian inference, a method known since the work of Bayes (1763). Bayesian inference or Bayesian classification was applied successfully by the 1950s to language problems like optical character recognition (Bledsoe and Browning, 1959) and to authorship attribution tasks like the seminal...
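
The idea of using probability as the matching metric can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the standard noisy channel formulation, in which the decoder picks the sentence W that maximizes P(O | W) P(W) for an observed waveform O, and it searches a tiny hand-written candidate list instead of every sentence in the language. The function name `decode` and all probability values are hypothetical, invented for this sketch, and do not come from the slides.

```python
import math

# Hypothetical toy scores, invented purely for illustration.
# acoustic_likelihood[W] stands in for P(O | W): how well sentence W
# explains the observed waveform O.
acoustic_likelihood = {
    "the learning and knowledge": 0.20,
    "the leaning over the edge": 0.30,
    "the leaning over the hedge": 0.25,
}

# language_prior[W] stands in for P(W): how plausible W is as a sentence.
language_prior = {
    "the learning and knowledge": 0.050,
    "the leaning over the edge": 0.010,
    "the leaning over the hedge": 0.002,
}


def decode(candidates, likelihood, prior):
    """Return the candidate W that maximizes log P(O | W) + log P(W)."""

    def score(sentence):
        # Work in the log domain so that products of small
        # probabilities become sums and do not underflow.
        return math.log(likelihood[sentence]) + math.log(prior[sentence])

    return max(candidates, key=score)


best = decode(list(language_prior), acoustic_likelihood, language_prior)
print(best)
```

With these made-up numbers, the candidate with the best acoustic score alone ("the leaning over the edge") loses to "the learning and knowledge" once the prior is included, which is the point of combining the two terms. A real recognizer cannot enumerate candidates this way, since the set of possible sentences is far too large for exhaustive search.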

This note was uploaded on 02/11/2012 for the course ECE 5526 taught by Professor Staff during the Summer '09 term at FIT.
