EE596A/Winter 2013/DGMs – Lecture 4, Jan 23rd, 2013, Prof. Jeff Bilmes

Edge Marginals
Need p(q_t, q_{t+1} | x_{1:T}) for learning tasks.

These are easy to obtain with both the α and β quantities, since

\[
p(q_{t-1}, q_t \mid x_{1:T}) = \frac{p(q_{t-1}, q_t, x_{1:T})}{p(x_{1:T})} \tag{4.37}
\]

and

\begin{align}
p(q_{t-1}, q_t, x_{1:T})
&= p(x_t \mid q_t)\, p(q_{t-1}, q_t, x_{1:t-1}, x_{t+1:T}) \tag{4.39}\\
&= p(x_t \mid q_t)\, p(x_{t+1:T} \mid q_t)\, p(q_t, q_{t-1}, x_{1:t-1}) \tag{4.40}\\
&= p(x_t \mid q_t)\, \beta_t(q_t)\, p(q_t \mid q_{t-1}, x_{1:t-1})\, p(q_{t-1}, x_{1:t-1}) \tag{4.41}\\
&= p(x_t \mid q_t)\, \beta_t(q_t)\, p(q_t \mid q_{t-1})\, \alpha_{t-1}(q_{t-1}) \tag{4.42}
\end{align}

So all of the edge marginals can be computed using the standard recursions.
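As a concrete sketch (not from the lecture), eq. (4.42) translates almost directly into NumPy, assuming the forward quantities α_t(q) = p(q_t = q, x_{1:t}) and backward quantities β_t(q) = p(x_{t+1:T} | q_t = q) have already been computed; the function and argument names here are hypothetical.

```python
import numpy as np

def edge_marginals(A, B_obs, alpha, beta):
    """Edge marginals p(q_{t-1}, q_t | x_{1:T}) via eq. (4.42):
    alpha_{t-1}(q_{t-1}) * p(q_t | q_{t-1}) * p(x_t | q_t) * beta_t(q_t),
    normalized by p(x_{1:T}) as in eq. (4.37).

    A[i, j]     = p(q_t = j | q_{t-1} = i)   (transition matrix)
    B_obs[t, j] = p(x_t | q_t = j)           (observation likelihoods)
    alpha[t, j] = p(q_t = j, x_{1:t})        (forward quantities)
    beta[t, j]  = p(x_{t+1:T} | q_t = j)     (backward quantities)
    Returns xi with xi[t-1, i, j] = p(q_{t-1} = i, q_t = j | x_{1:T}).
    """
    T, N = alpha.shape
    px = alpha[-1].sum()          # p(x_{1:T}), the denominator of eq. (4.37)
    xi = np.zeros((T - 1, N, N))
    for t in range(1, T):
        # broadcast: (N,1) * (N,N) * (1,N) * (1,N) -> (N,N)
        xi[t - 1] = (alpha[t - 1][:, None] * A
                     * B_obs[t][None, :] * beta[t][None, :]) / px
    return xi
```

Each xi[t-1] sums to one, and summing out the earlier state recovers the usual single-node marginals α_t(q) β_t(q) / p(x_{1:T}).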
There are several aspects of HMMs we will discuss: 1) how flexible HMMs are, 2) real-world inference in HMMs (what to do when the state space gets large), and 3) time-space tradeoffs.

What HMMs can do

HMMs are more powerful than you might think. We'll see that many DGMs can be represented by HMMs, so before we move on to DGMs, we should understand how flexible HMMs are (and then, as we go through the course, we'll see what the penalties are for making such HMM representations).

We next visit a set of properties of HMMs that should be remembered.

Observations are not i.i.d.
Joint probability under an HMM:

\[
p(X_{t:t+h} = x_{t:t+h}) = \sum_{q_{t:t+h}} \prod_{j=t}^{t+h} p(X_j = x_j \mid Q_j = q_j)\, a_{q_{j-1} q_j}.
\]

Unless only one state is possible, the observations do not factorize.
In an HMM, there are no marginal independence statements of the form A ⊥⊥ B.
HMMs do have conditional independence properties (as all DGMs do):
\[
p(X_{t:t+h} = x_{t:t+h} \mid Q_{t:t+h} = q_{t:t+h}) = \prod_{\tau=t}^{t+h} p(X_\tau = x_\tau \mid Q_\tau = q_\tau).
\]
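A small brute-force check makes the first point concrete: summing the joint over all state sequences gives the observation marginal, and with "sticky" transitions that marginal does not factorize into a product of per-time marginals. The toy HMM below is illustrative only, not from the lecture.

```python
import itertools
import numpy as np

# Toy 2-state HMM; all numbers are illustrative.
A = np.array([[0.9, 0.1], [0.1, 0.9]])    # sticky transitions a_{ij}
pi = np.array([0.5, 0.5])                  # stationary for this symmetric A
B = np.array([[0.9, 0.2], [0.1, 0.8]])     # B[x, j] = p(X = x | Q = j)

def p_obs(xs):
    """p(X = xs) by brute-force sum over all state sequences,
    mirroring the sum-of-products expression on the slide."""
    total = 0.0
    for qs in itertools.product(range(2), repeat=len(xs)):
        p = pi[qs[0]] * B[xs[0], qs[0]]
        for j in range(1, len(xs)):
            p *= A[qs[j - 1], qs[j]] * B[xs[j], qs[j]]
        total += p
    return total

# Since pi is stationary here, p(X_t = 0) = 0.55 at every t, yet:
joint = p_obs([0, 0])               # p(X_1 = 0, X_2 = 0)
product = p_obs([0]) * p_obs([0])   # would equal joint if observations were i.i.d.
```

Because the hidden state persists, seeing a 0 makes the next 0 more likely, so `joint` exceeds `product`; conditioned on the full state sequence, by contrast, the observations factorize exactly as in the equation above.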
Observations are not "Viterbi i.i.d."

The Viterbi path (most-probable explanation) of an HMM is defined:

\[
q^*_{1:T} = \operatorname*{argmax}_{q_{1:T}} \; p(X_{1:T} = x_{1:T}, q_{1:T})
\]

If we max-marginalize over the hidden states, does that lead to an i.i.d. distribution?
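To make the definition concrete, here is a minimal log-space Viterbi sketch (the names and structure are my own, not the lecture's; it assumes strictly positive parameters for simplicity).

```python
import numpy as np

def viterbi(A, B_obs, pi):
    """Sketch of the Viterbi path q*_{1:T} = argmax_{q_{1:T}} p(x_{1:T}, q_{1:T}).

    A[i, j]     = p(q_t = j | q_{t-1} = i)
    B_obs[t, j] = p(X_t = x_t | Q_t = j) for the observed sequence
    pi[j]       = p(q_1 = j)
    Works in log space to avoid underflow on long sequences.
    """
    T, N = B_obs.shape
    logA, logB, logpi = np.log(A), np.log(B_obs), np.log(pi)
    delta = np.zeros((T, N))            # delta[t, j] = best log-prob of a path ending in j
    psi = np.zeros((T, N), dtype=int)   # backpointers to the best predecessor
    delta[0] = logpi + logB[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA   # scores[i, j]: come from i, land in j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[t]
    path = [int(delta[-1].argmax())]            # best final state, then backtrack
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]
```

On a sticky-transition HMM, the decoded path tends to stay in one state across a run of consistent observations, which is exactly why max-marginalizing does not behave like an i.i.d. model.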