Ch6: Design of HMM SRS

Speech Recognition: Designing HMM-Based ASR Systems

Lecture based on Dr. Rita Singh's notes, School of Computer Science, Carnegie Mellon University.
Veton Këpuska, February 16, 2012

Outline

- Isolated word recognition
  - Bayesian classification
  - Decoding based on best state sequences
- Word sequence recognition
  - Bayesian classification
  - Decoding based on best state sequences
  - Best-path decoding and collapsing graphs
- Search strategies
  - Depth-first
  - Breadth-first
- Optimal graph structures
  - Constrained-language decoding
  - Natural-language decoding
- HMMs for language modeling
  - Unigram
  - Bigram
  - Trigram

Statistical Pattern Recognition

Given data X, find which of a number of classes C_1, C_2, ..., C_N it belongs to, based on known distributions of data from C_1, C_2, etc.

Bayesian classification:

    Class = C_i,  where  i = argmax_j P(X | C_j) P(C_j)

Here P(X | C_j) is the probability of X as given by the probability distribution of C_j, and P(C_j) is the a priori probability of C_j.

- The a priori probability accounts for the relative proportions of the classes. If you never saw any data, you would guess the class based on these probabilities alone.
- P(X | C_j) accounts for the evidence obtained from the observed data X.

Statistical Classification of Isolated Words

- Classes are words.
- Data are instances of isolated spoken words: a sequence of feature vectors derived from the speech signal, typically one vector from each 25 ms frame of speech, with frames shifted by 10 ms (a frame-slicing sketch follows these notes).
- Bayesian classification:

    Recognized_Word = argmax_word P(word) P(X | word)

- P(word) is the a priori probability of the word, obtained from our expectation of its relative frequency of occurrence (a minimal decision-rule sketch also follows these notes).
- P(X | word) is the probability of X computed from the probability distribution function of the word.

Computing P(X | word)

- To compute P(X | word), there must be a statistical distribution for X corresponding to word; each word must therefore be represented by some statistical model.
- We represent each word by an HMM. An HMM is really a graphical form of a probability density function for time-varying data.
- Each state has a probability distribution function, and transitions between states are governed by transition probabilities.
- At each time instant the model is in some state, and it emits one observation vector from the distribution associated with that state.
- The actual state sequence that generated X is never known, so P(X | word) must account for all possible state sequences (see the forward-algorithm sketch below).
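The 25 ms frame / 10 ms shift convention above is easy to make concrete. Below is a minimal sketch, not from the lecture, of slicing a 16 kHz signal into overlapping analysis frames; the function name frame_signal and its defaults are illustrative assumptions.

```python
import numpy as np

def frame_signal(signal: np.ndarray, sample_rate: int = 16000,
                 frame_ms: float = 25.0, shift_ms: float = 10.0) -> np.ndarray:
    """Slice a 1-D speech signal into overlapping analysis frames.

    With the defaults, 16 kHz audio gives 400-sample frames
    advanced by 160 samples, i.e. one feature vector per 10 ms.
    """
    frame_len = int(sample_rate * frame_ms / 1000)   # 400 samples
    shift = int(sample_rate * shift_ms / 1000)       # 160 samples
    n_frames = 1 + max(0, (len(signal) - frame_len) // shift)
    return np.stack([signal[i * shift : i * shift + frame_len]
                     for i in range(n_frames)])

# One second of (random, placeholder) audio -> 98 frames of 400 samples each.
frames = frame_signal(np.random.randn(16000))
print(frames.shape)   # (98, 400)
```

In a real front end each frame would then be mapped to a feature vector (e.g. cepstral coefficients); the framing step itself is all this sketch shows.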
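The decision rule Recognized_Word = argmax_word P(word) P(X | word) is usually evaluated in log space to avoid numerical underflow. A minimal sketch with made-up words and scores; the priors and likelihoods here are illustrative placeholders, not values computed from any model.

```python
import math

# Hypothetical scores: log P(word) from a prior, and log P(X | word) from
# each word's HMM (computed elsewhere, e.g. by the forward algorithm
# sketched below). All numbers are invented for illustration.
log_prior = {"yes": math.log(0.6), "no": math.log(0.4)}
log_likelihood = {"yes": -142.7, "no": -139.2}

# Bayesian classification: argmax over words of
# log P(word) + log P(X | word).
recognized = max(log_prior, key=lambda w: log_prior[w] + log_likelihood[w])
print(recognized)   # "no": its better likelihood outweighs its smaller prior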
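Because the state sequence that generated X is never known, P(X | word) is the sum over all state sequences, which is exactly what the forward algorithm computes. A sketch in log space, assuming numpy/scipy; the array layout (log_A, log_pi, log_b) is my own convention, not the lecture's notation.

```python
import numpy as np
from scipy.special import logsumexp

def log_forward(log_A: np.ndarray, log_pi: np.ndarray,
                log_b: np.ndarray) -> float:
    """Forward algorithm in log space.

    log_A[i, j] : log transition probability from state i to state j
    log_pi[i]   : log probability of starting in state i
    log_b[t, i] : log P(x_t | state i), the state output distributions
                  evaluated on the observed feature vectors

    Returns log P(X | word): the sum over ALL state sequences.
    """
    alpha = log_pi + log_b[0]                         # initialization, t = 0
    for t in range(1, log_b.shape[0]):
        # alpha[j] = logsum_i( alpha[i] + log_A[i, j] ) + log_b[t, j]
        alpha = logsumexp(alpha[:, None] + log_A, axis=0) + log_b[t]
    return float(logsumexp(alpha))                    # sum over final states
```

Running log_forward once per word HMM yields the log-likelihoods that the decision-rule sketch above consumes.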