224s.09.lec8

CS 224S / LINGUIST 281: Speech Recognition, Synthesis, and Dialogue
Dan Jurafsky
Lecture 6: Forward-Backward (Baum-Welch) and Word Error Rate
IP Notice:

Outline for Today
- Speech Recognition Architectural Overview
- Hidden Markov Models in general and for speech
- Forward
- Viterbi Decoding
- How this fits into the ASR component of the course:
  - Jan 27 (today): HMMs, Forward, Viterbi
  - Jan 29: Baum-Welch (Forward-Backward)
  - Feb 3: Feature Extraction, MFCCs
  - Feb 5: Acoustic Modeling and GMMs
  - Feb 10: N-grams and Language Modeling
  - Feb 24: Search and Advanced Decoding
  - Feb 26: Dealing with Variation
LVCSR
- Large Vocabulary Continuous Speech Recognition
- ~20,000-64,000 words
- Speaker independent (vs. speaker-dependent)
- Continuous speech (vs. isolated-word)

Viterbi trellis for “five” (figure)

Search space with bigrams (figure)
Viterbi trellis (figure)

Viterbi backtrace (figure)
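The trellis and backtrace figures from these slides are not included in this text version. As a rough sketch of what they compute, here is a minimal Viterbi decoder for a discrete-output HMM in NumPy; the function name, array layout, and any toy model you feed it are illustrative choices, not taken from the slides.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely state path for a discrete-output HMM.

    obs : sequence of observation indices, length T
    pi  : (N,) initial state probabilities
    A   : (N, N) transition probabilities, A[i, j] = P(state j | state i)
    B   : (N, V) emission probabilities,  B[i, o] = P(symbol o | state i)
    Returns (best_path, log_prob_of_best_path).
    """
    N, T = len(pi), len(obs)
    logpi, logA, logB = np.log(pi), np.log(A), np.log(B)

    delta = np.zeros((T, N))            # best log score ending in state j at time t
    psi = np.zeros((T, N), dtype=int)   # backpointers used for the backtrace

    delta[0] = logpi + logB[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA   # (N, N): previous state x next state
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, obs[t]]

    # Backtrace: start from the best final state and follow the backpointers.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1], float(delta[-1].max())
```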
The Learning Problem
- Baum-Welch = Forward-Backward algorithm (Baum 1972)
- A special case of the EM (Expectation-Maximization) algorithm (Dempster, Laird, Rubin)
- The algorithm will let us train the transition probabilities A = {a_ij} and the emission probabilities B = {b_i(o_t)} of the HMM

Input to Baum-Welch
- O: unlabeled sequence of observations
- Q: vocabulary of hidden states
- For the ice-cream task: O = {1,3,2,...}, Q = {H,C}
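Concretely, the two inputs for the ice-cream task might be set up as below (a minimal sketch; only the first three observation values appear on the slide, and the rest of the sequence is left elided rather than invented).

```python
# Hidden-state vocabulary Q for the ice-cream task: Hot and Cold days.
Q = ["H", "C"]

# Unlabeled observation sequence O: number of ice creams eaten each day.
# Only the first three values come from the slide; the sequence continues ("...").
O = [1, 3, 2]

# Note what is NOT given: the hidden state that produced each observation.
```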
Starting out with Observable Markov Models
- How to train?
- Run the model on the observation sequence O.
- Since it's not hidden, we know which states we went through, hence which transitions and observations were used.
- Given that information, training:
  - B = {b_k(o_t)}: since every state can only generate one observation symbol, the observation likelihoods B are all 1.0
  - A = {a_ij}: a_ij = C(i→j) / Σ_{q∈Q} C(i→q)
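As a small illustration of that count-and-normalize estimate (not code from the lecture; the helper name and example paths are made up), the transition probabilities of a fully observed Markov chain can be computed like this:

```python
from collections import Counter

def mle_transitions(state_sequences):
    """Estimate a_ij = C(i -> j) / sum over q of C(i -> q) from observed state paths."""
    counts = Counter()
    for seq in state_sequences:
        for i, j in zip(seq, seq[1:]):
            counts[(i, j)] += 1          # C(i -> j)
    totals = Counter()
    for (i, _), c in counts.items():
        totals[i] += c                   # sum over q of C(i -> q)
    return {(i, j): c / totals[i] for (i, j), c in counts.items()}

# Example with hypothetical fully observed paths over states H and C:
paths = [["H", "H", "C", "C", "H"], ["C", "C", "H", "H", "H"]]
A = mle_transitions(paths)
# A[("H", "C")] is the fraction of transitions out of H that went to C (0.25 here).
```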

Extending the Intuition to HMMs
- For an HMM, we cannot compute these counts directly from observed sequences
- Baum-Welch intuition: iteratively estimate the counts
- Start with an estimate for a_ij and b_k, then iteratively improve the estimates
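A compact sketch of that iterative re-estimation for a discrete-output HMM is given below. It assumes a single observation sequence, random initialization, a fixed number of iterations, and no probability scaling, so it is an illustration of the update equations rather than a practical trainer; the function name and defaults are illustrative, not from the slides.

```python
import numpy as np

def baum_welch(obs, N, V, n_iter=10, seed=0):
    """Re-estimate HMM parameters (pi, A, B) from one unlabeled sequence obs.

    obs : list of observation indices in range(V)
    N   : number of hidden states, V : number of observation symbols
    """
    rng = np.random.default_rng(seed)
    # Random row-normalized starting guesses for pi, A, B.
    pi = rng.random(N); pi /= pi.sum()
    A = rng.random((N, N)); A /= A.sum(axis=1, keepdims=True)
    B = rng.random((N, V)); B /= B.sum(axis=1, keepdims=True)
    T = len(obs)

    for _ in range(n_iter):
        # E-step: forward (alpha) and backward (beta) probabilities.
        alpha = np.zeros((T, N)); beta = np.zeros((T, N))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

        likelihood = alpha[-1].sum()
        gamma = alpha * beta / likelihood          # P(state i at time t | O)
        xi = np.zeros((T - 1, N, N))               # P(i at t, j at t+1 | O)
        for t in range(T - 1):
            xi[t] = (alpha[t][:, None] * A *
                     B[:, obs[t + 1]] * beta[t + 1]) / likelihood

        # M-step: expected counts take the place of the observed counts above.
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        B = np.zeros((N, V))
        for t in range(T):
            B[:, obs[t]] += gamma[t]
        B /= gamma.sum(axis=0)[:, None]
    return pi, A, B

# e.g. on an ice-cream sequence (symbols 1..3 mapped to indices 0..2);
# in practice a much longer observation sequence would be used:
# pi, A, B = baum_welch([o - 1 for o in [1, 3, 2]], N=2, V=3)
```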