Prof. Jeff Bilmes, EE596A/Winter 2013/DGMs – Lecture 4, Jan 23rd, 2013

HMM, posteriors
How best to compute γt(i) for all t?

Suboptimal way:

for t = 1 . . . T do
    Compute αt(j), starting at time 1 up to time t
    Compute βt(j), starting at time T down to time t
    Compute γt(j)
    Use γt(j) as needed (e.g., for parameter learning)
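The suboptimal loop above can be sketched as follows. This is a hypothetical NumPy setup, not code from the slides: `pi` is the initial state distribution, `A` the transition matrix, and `B[:, o]` the emission likelihoods for observation `o`.

```python
import numpy as np

def gamma_suboptimal(pi, A, B, obs):
    """Rerun the full forward and backward recursions for every t:
    O(T^2 N^2) work, where N is the number of states."""
    T, N = len(obs), len(pi)
    gammas = np.zeros((T, N))
    for t in range(T):
        # forward pass from time 0 up to time t, redone from scratch each iteration
        alpha = pi * B[:, obs[0]]
        for s in range(1, t + 1):
            alpha = (alpha @ A) * B[:, obs[s]]
        # backward pass from time T-1 down to time t, also redone from scratch
        beta = np.ones(N)
        for s in range(T - 2, t - 1, -1):
            beta = A @ (B[:, obs[s + 1]] * beta)
        g = alpha * beta            # unnormalized posterior at time t
        gammas[t] = g / g.sum()     # gamma_t(j) = p(q_t = j | observations)
    return gammas
```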
But this is extremely wasteful. Once we have computed αt(j) for time t, we should hold on to it for the next time step, since it is exactly what is needed to compute αt+1(j). Similarly, once we have βt(j), we should save it to compute βt−1(j) at the previous time.
Dynamic programming: the problem has optimal substructure and common subproblems; the subproblem shared by all of the αt(j) computations is exactly the complete computation of αt−1(j).
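The resulting dynamic program can be sketched as one stored forward sweep plus one stored backward sweep, giving every γt(j) in O(TN²) time. The parameterization here is an illustrative assumption, not from the slides: `pi` is the initial distribution, `A` the transition matrix, `B[:, o]` the emission likelihoods.

```python
import numpy as np

def gamma_forward_backward(pi, A, B, obs):
    """Forward-backward with caching: O(T N^2) total."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        # reuse alpha[t-1] instead of recomputing the whole forward pass
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        # reuse beta[t+1] instead of recomputing the whole backward pass
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    g = alpha * beta                          # gamma_t(j) up to normalization
    return g / g.sum(axis=1, keepdims=True)
```

(A practical implementation would also rescale each αt and βt to avoid underflow for long sequences; that is omitted here for clarity.)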
This is clear when viewed as messages in the GM (messages don't proceed until they have received their appropriate incoming messages).

HMMs and message passing/LBP
Generic message definition for arbitrary p ∈ F(G, R):

$$\mu_{i \to j}(x_j) = \sum_{x_i} \psi_{i,j}(x_i, x_j) \prod_{k \in \delta(i) \setminus \{j\}} \mu_{k \to i}(x_i) \qquad (4.30)$$

If the graph is a tree, and if we obey the message-passing protocol order, then we will reach a point where we've got marginals. I.e.,

$$p(x_i) \propto \prod_{j \in \delta(i)} \mu_{j \to i}(x_i) \qquad (4.31)$$

and

$$p(x_i, x_j) \propto \psi_{i,j}(x_i, x_j) \prod_{k \in \delta(i) \setminus \{j\}} \mu_{k \to i}(x_i) \prod_{\ell \in \delta(j) \setminus \{i\}} \mu_{\ell \to j}(x_j) \qquad (4.32)$$
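Equations (4.30) and (4.31) can be sketched as sum-product message passing on a small tree. The example below uses a hypothetical 3-node binary chain 0–1–2 with made-up pairwise potentials; it is an illustration of the equations, not code from the lecture.

```python
import numpy as np

# psi[(i, j)] is the pairwise potential table psi_{i,j}(x_i, x_j);
# node potentials are omitted for brevity. All variables are binary.
psi = {
    (0, 1): np.array([[1.0, 0.5], [0.5, 2.0]]),
    (1, 2): np.array([[2.0, 1.0], [1.0, 1.0]]),
}
neighbors = {0: [1], 1: [0, 2], 2: [1]}

def potential(i, j):
    # look up psi_{i,j}, transposing if it is stored as psi_{j,i}
    return psi[(i, j)] if (i, j) in psi else psi[(j, i)].T

def message(i, j, msgs):
    # eq. (4.30): mu_{i->j}(x_j) =
    #   sum_{x_i} psi_{i,j}(x_i, x_j) * prod_{k in delta(i)\{j}} mu_{k->i}(x_i)
    prod = np.ones(2)
    for k in neighbors[i]:
        if k != j:
            prod = prod * msgs[(k, i)]
    return potential(i, j).T @ prod

# message-passing protocol order on the chain: leaves inward, then outward
msgs = {}
for (i, j) in [(0, 1), (2, 1), (1, 2), (1, 0)]:
    msgs[(i, j)] = message(i, j, msgs)

def marginal(i):
    # eq. (4.31): p(x_i) proportional to prod_{j in delta(i)} mu_{j->i}(x_i)
    b = np.ones(2)
    for j in neighbors[i]:
        b = b * msgs[(j, i)]
    return b / b.sum()
```

Note how the protocol order enforces exactly the constraint in the text: a message leaves node 1 only after node 1 has received the messages from its other neighbors.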