# HMM-handout1


Massachusetts Institute of Technology
Department of Electrical Engineering & Computer Science
6.345/HST.728 Automatic Speech Recognition, Spring 2010

Lecture Handouts, 4/6/10: Hidden Markov Models (HMMs)

Reading: Rabiner, "A Tutorial on Hidden Markov Models," 1989.

### Hidden Markov Models for Speech Recognition and Acoustic Model Training

Larry Gillick, April 6, 8, and 13, 2010

### How to do Speech Recognition

Let $W$ = sequence of words and $Y$ = sequence of frames (acoustics). Our goal is to compute $P(W = w \mid Y = y)$ for all $w$, and then to compute the "recognized transcript" as follows:

$$ w_{\max} = \arg\max_{w} P(W = w \mid Y = y) $$
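As a toy illustration of this decision rule, the sketch below picks the hypothesis with the highest posterior. The candidate word strings and their posterior values are invented for the example; in a real recognizer the set of hypotheses is enormous and is searched rather than enumerated.

```python
# Minimal sketch of the decision rule w_max = argmax_w P(W=w | Y=y).
# Hypotheses and posterior probabilities are invented for illustration.
posteriors = {
    "recognize speech": 0.62,
    "wreck a nice beach": 0.31,
    "recognize peach": 0.07,
}

# argmax over candidate word strings
w_max = max(posteriors, key=posteriors.get)
print(w_max)  # -> recognize speech
```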
### "The Fundamental Equation of Speech Recognition"

Actually, it's just Bayes' Theorem:

$$ P(w \mid y) = \frac{P(y \mid w)\,P(w)}{P(y)} $$

- $P(y \mid w)$ is supplied by the acoustic model
- $P(w)$ is supplied by the language model
- $P(y)$ is a (largely irrelevant) normalizing constant

We must obtain a formula for $P(y \mid w)$.

### Comments on our equation

$P(w)$ can be regarded as a prior on word strings, and $P(w \mid y)$ is the posterior on word strings, given that we've seen the acoustics. But why don't we model $P(w \mid y)$ directly? Much harder! $P(w \mid y)$ is not local: we would need to consider all possible word strings. $P(y \mid w)$, by contrast, focuses on the frame sequence for the single string $w$.
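A quick sketch of why $P(y)$ is irrelevant for recognition: since it is the same for every hypothesis $w$, we can rank word strings by $P(y \mid w)\,P(w)$ alone. All probabilities below are invented for the example.

```python
# Rank word strings by P(y|w) * P(w), dropping the constant P(y).
# The acoustic and language model probabilities are made up.
acoustic = {"two": 0.020, "too": 0.018, "to": 0.019}   # P(y | w)
language = {"two": 0.10,  "too": 0.25,  "to": 0.60}    # P(w)

def unnormalized_posterior(w):
    return acoustic[w] * language[w]   # proportional to P(w | y)

best = max(acoustic, key=unnormalized_posterior)
print(best)  # -> "to": the prior outweighs the slight acoustic edge of "two"
```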

### Working in the log domain

Probabilities can get very small when carrying out successive multiplies, so we rewrite our formula as follows:

$$ \log P(w \mid y) = \log P(y \mid w) + \log P(w) - \log P(y) $$

or, if we ignore the term that is independent of $w$,

$$ S_{\mathrm{Total}}(w) = S_{\mathrm{AM}}(w) + S_{\mathrm{LM}}(w) $$

that is, TotScore = AcModScore + LangModScore. In these lectures, however, we'll avoid the log domain for reasons of clarity.

### The Acoustic Model

We need a formula for $P(y \mid w)$, the acoustic model probability. This will be provided via the use of a Hidden Markov Model. The model makes certain "independence" assumptions of dubious validity; nevertheless, it has been very successful.
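The underflow problem is easy to demonstrate. In the sketch below the two probabilities are invented, but they are of the magnitude one meets after multiplying thousands of per-frame likelihoods: the product underflows double precision while the sum of logs stays well behaved.

```python
import math

# Log-domain scoring sketch: products of many small probabilities
# underflow, but sums of logs do not. Values are invented.
am_prob = 1e-300   # P(y | w): tiny after thousands of frame likelihoods
lm_prob = 1e-30    # P(w)

print(am_prob * lm_prob)                       # 0.0 -- underflow
s_total = math.log(am_prob) + math.log(lm_prob)
print(s_total)                                 # about -759.9, i.e. S_AM + S_LM
```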
### Stochastic processes

A stochastic process is a sequence of random variables or vectors, defined on the same sample space, either in discrete time:

$$ \{X_0, X_1, \ldots, X_t, \ldots\} $$

or in continuous time:

$$ \{X_t : t \geq 0\} $$

### Markov Chains

A Markov Chain is a discrete-state, discrete-time stochastic process satisfying the so-called Markov assumption: "the future depends on the past only through the present." We'll assume the $X$'s take on a finite set of values; those are the possible states.

$$ P(X_{n+1} = x_{n+1} \mid X_n = x_n, \ldots, X_0 = x_0) = P(X_{n+1} = x_{n+1} \mid X_n = x_n) $$
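A minimal sketch of a finite-state Markov chain, with three invented weather states: sampling the next state looks only at the current one, which is exactly the Markov assumption in action.

```python
import random

# Toy Markov chain. States and transition probabilities are invented;
# the next state depends only on the current state.
transitions = {
    "sunny": [("sunny", 0.8), ("rainy", 0.1), ("foggy", 0.1)],
    "rainy": [("sunny", 0.2), ("rainy", 0.6), ("foggy", 0.2)],
    "foggy": [("sunny", 0.2), ("rainy", 0.3), ("foggy", 0.5)],
}

def sample_chain(start, n):
    state, path = start, [start]
    for _ in range(n):
        states, probs = zip(*transitions[state])
        state = random.choices(states, weights=probs)[0]
        path.append(state)
    return path

print(sample_chain("sunny", 10))
```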

### Transition functions

The probability of transitioning from one state $a$ to another state $b$ is

$$ p(a, b) = P(X_{n+1} = b \mid X_n = a) $$
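Viewed as a table, the transition function is a row-stochastic matrix: each row is a distribution over next states. The sketch below reuses the invented weather states from above and chains transitions to get a two-step probability by summing over the intermediate state.

```python
# Transition function p(a, b) as a row-stochastic matrix over the
# invented weather states; rows must sum to 1.
P = {
    "sunny": {"sunny": 0.8, "rainy": 0.1, "foggy": 0.1},
    "rainy": {"sunny": 0.2, "rainy": 0.6, "foggy": 0.2},
    "foggy": {"sunny": 0.2, "rainy": 0.3, "foggy": 0.5},
}

assert all(abs(sum(row.values()) - 1.0) < 1e-12 for row in P.values())

def two_step(a, b):
    # P(X_{n+2} = b | X_n = a): sum over the intermediate state m
    return sum(P[a][m] * P[m][b] for m in P)

print(two_step("sunny", "rainy"))  # 0.8*0.1 + 0.1*0.6 + 0.1*0.3 = 0.17
```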