
# Hidden Markov Models


$$\hat{b}_j(v_k) = \frac{\displaystyle\sum_{t=1 \,\text{ s.t. }\, O_t = v_k}^{T} \gamma_t(j)}{\displaystyle\sum_{t=1}^{T} \gamma_t(j)} \qquad (9.43)$$

We now have ways in Eq. 9.38 and Eq. 9.43 to re-estimate the transition probabilities A and observation probabilities B from an observation sequence O, assuming that we already have a previous estimate of A and B. These re-estimations form the core of the iterative **forward-backward** algorithm.

The forward-backward algorithm (Fig. 9.16) starts with some initial estimate of the HMM parameters λ = (A, B). We then iteratively run two steps. Like other cases of the EM (expectation-maximization) algorithm, the forward-backward algorithm has two steps: the expectation step, or **E-step**, and the maximization step, or **M-step**. In the E-step, we compute the expected state occupancy count γ and the expected state transition count ξ from the earlier A and B probabilities. In the M-step, we use γ and ξ to recompute new A and B probabilities.

```
function FORWARD-BACKWARD(observations of len T, output vocabulary V,
                          hidden state set Q) returns HMM = (A, B)

  initialize A and B
  iterate until convergence
    E-step
      γ_t(j)   = α_t(j) β_t(j) / α_T(q_F)                        ∀ t and j
      ξ_t(i,j) = α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j) / α_T(q_F)  ∀ t, i, and j
    M-step
      â_ij     = Σ_{t=1..T-1} ξ_t(i,j) / Σ_{t=1..T-1} Σ_{k=1..N} ξ_t(i,k)
      b̂_j(v_k) = Σ_{t=1..T s.t. O_t=v_k} γ_t(j) / Σ_{t=1..T} γ_t(j)
  return A, B
```

Figure 9.16 The forward-backward algorithm.

Although in principle the forward-backward algorithm can do completely unsupervised learning of the A and B parameters, in practice the initial conditions are very important. For this reason the algorithm is often given extra information. For example, for speech recognition, in practice the HMM structure is often set by hand, and only the emission (B) and (non-zero) A transition probabilities are trained from a set of observation sequences O. Section ?? in Chapter 31 also discusses how initial A and B estimates are derived in speech recognition. We also show that for speech the forward-backward algorithm can be extended to inputs that are non-discrete ("continuous observation densities").
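As a minimal NumPy sketch (not the book's code), the procedure of Fig. 9.16 can be implemented directly. Two simplifications are assumptions of this sketch rather than part of the figure: a uniform initial state distribution `pi` stands in for the book's special start state q_0, and `alpha[-1].sum()` (the total probability of the observations) plays the role of α_T(q_F).

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward probabilities alpha[t, j] = P(o_1..o_t, q_t = j)."""
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def backward(A, B, obs):
    """Backward probabilities beta[t, i] = P(o_{t+1}..o_T | q_t = i)."""
    T, N = len(obs), A.shape[0]
    beta = np.zeros((T, N))
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

def forward_backward(obs, N, M, n_iter=20, seed=0):
    """Re-estimate A (N x N) and B (N x M) from one observation sequence.

    Random initialization is an assumption of this sketch; in practice
    (as the text notes) initial conditions are very important.
    """
    rng = np.random.default_rng(seed)
    A = rng.random((N, N)); A /= A.sum(axis=1, keepdims=True)
    B = rng.random((N, M)); B /= B.sum(axis=1, keepdims=True)
    pi = np.full(N, 1.0 / N)   # stands in for the special start state
    T = len(obs)
    for _ in range(n_iter):
        alpha = forward(A, B, pi, obs)
        beta = backward(A, B, obs)
        likelihood = alpha[-1].sum()   # plays the role of alpha_T(q_F)
        # E-step: expected counts gamma_t(j) and xi_t(i, j)
        gamma = alpha * beta / likelihood
        xi = np.zeros((T - 1, N, N))
        for t in range(T - 1):
            xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1] / likelihood
        # M-step: re-estimate A and B from the expected counts
        A = xi.sum(axis=0) / xi.sum(axis=(0, 2))[:, None]
        for k in range(M):
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]
        pi = gamma[0]
    return A, B
```

After training on, say, `obs = np.array([0, 1, 2, 2, 1, 0, 0, 1])` with `N=2` states and `M=3` symbols, each row of the returned A and B remains a proper probability distribution, since the M-step divides expected counts by their totals.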


## 9.6 Summary

This chapter introduced the **hidden Markov model** for probabilistic sequence classification.

- Hidden Markov models (**HMMs**) are a way of relating a sequence of observations to a sequence of hidden classes or hidden states that explain the observations.
- The process of discovering the sequence of hidden states, given the sequence of observations, is known as **decoding** or **inference**. The **Viterbi algorithm** is commonly used for decoding.
- The parameters of an HMM are the A transition probability matrix and the B observation likelihood matrix. Both can be trained with the **Baum-Welch** or **forward-backward** algorithm.

## Bibliographical and Historical Notes

As we discussed at the end of Chapter 4, Markov chains were first used by Markov (1913, 2006) to predict whether an upcoming letter in Pushkin's *Eugene Onegin* would be a vowel or a consonant.
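The decoding step mentioned in the summary can be sketched as a standard Viterbi implementation. This is a generic sketch, not the chapter's figure; the probabilities in the usage example below are illustrative, and an explicit initial distribution `pi` is assumed in place of the book's special start state.

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Most likely hidden state sequence for obs under HMM (A, B, pi)."""
    T, N = len(obs), A.shape[0]
    v = np.zeros((T, N))           # v[t, j]: best path probability ending in state j at time t
    bp = np.zeros((T, N), dtype=int)  # backpointers for path recovery
    v[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        # scores[i, j]: probability of best path reaching j at t via i at t-1
        scores = v[t - 1][:, None] * A * B[:, obs[t]]
        bp[t] = scores.argmax(axis=0)
        v[t] = scores.max(axis=0)
    # Follow backpointers from the best final state
    path = [int(v[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(bp[t, path[-1]]))
    return path[::-1]
```

For example, with two states where state 0 tends to emit symbol 0 and state 1 tends to emit symbol 1, the observation sequence `[0, 0, 1, 1, 1]` decodes to the state sequence `[0, 0, 1, 1, 1]`.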