

HMM, posteriors: how best to compute γ_t(i) for all t?

A suboptimal way:

  1  for t = 1 … T do
  2      Compute α_t(j), starting at time 1 and working up to time t
  3      Compute β_t(j), starting at time T and working down to time t
  4      Compute γ_t(j)
  5  Use γ_t(j) as needed (e.g., for parameter learning).

But this is extremely wasteful. Once we have computed α_t(j) for time t, we should hold on to it, since it is needed for the next time step's α_{t+1}(j). Similarly, once we have β_t(j), we should save it for the previous time step's β_{t−1}(j).

This is dynamic programming: the problem has optimal substructure and common subproblems — the computation of α_t(j) contains, as a subproblem, exactly the complete computation of α_{t−1}(j). This is clear when the quantities are viewed as messages in the graphical model (a message does not proceed until it has received its appropriate incoming messages).

HMMs and message passing/LBP

Generic message definition for arbitrary p ∈ F(G, R):

  μ_{i→j}(x_j) = Σ_{x_i} ψ_{i,j}(x_i, x_j) ∏_{k ∈ δ(i)\{j}} μ_{k→i}(x_i)        (4.30)

If the graph is a tree, and if we obey the message-passing protocol order, then we will reach a point where we have the marginals. I.e.,

  p(x_i) ∝ ∏_{j ∈ δ(i)} μ_{j→i}(x_i)        (4.31)

and

  p(x_i, x_j) ∝ ψ_{i,j}(x_i, x_j) ∏_{k ∈ δ(i)\{j}} μ_{k→i}(x_i) ∏_{ℓ ∈ δ(j)\{i}} μ_{ℓ→j}(x_j)        (4.32)

Prof. Jeff Bilmes, EE596A/Winter 2013/DGMs – Lecture 4, Jan 23rd 2013, page 4-39 (of 239)
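The cached forward-backward recursion described above can be sketched as follows. This is an illustrative sketch, not the lecture's own code: the argument names (pi, A, B_obs) and the array layout are assumptions, and it omits the log-space or scaling tricks a production implementation would use for long sequences.

```python
import numpy as np

def forward_backward(pi, A, B_obs):
    """Compute gamma_t(j) = p(q_t = j | observations) for all t with one
    forward and one backward pass, caching alpha/beta instead of
    recomputing them from scratch at every t.

    pi:    (N,) initial state distribution
    A:     (N, N) transition matrix, A[i, j] = p(q_{t+1} = j | q_t = i)
    B_obs: (T, N) emission likelihoods, B_obs[t, j] = p(x_t | q_t = j)
    """
    T, N = B_obs.shape
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))

    # Forward pass: alpha_t is built from the cached alpha_{t-1},
    # so the whole pass is O(T N^2) rather than O(T^2 N^2).
    alpha[0] = pi * B_obs[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B_obs[t]

    # Backward pass: beta_t is built from the cached beta_{t+1}.
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B_obs[t + 1] * beta[t + 1])

    # gamma_t(j) is proportional to alpha_t(j) * beta_t(j).
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)  # normalize each time step
    return gamma
```

Each α and β value is computed exactly once and reused, which is precisely the dynamic-programming point: α_t's subproblem is the already-finished α_{t−1}.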
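The generic message definition (4.30) and the marginal formula (4.31) can be sketched for a small tree-structured pairwise model. The function name, the edge/potential representation, and the recursive scheduling are my own assumptions for illustration; the recursion enforces the message-passing protocol (a message μ_{i→j} is not computed until all incoming messages μ_{k→i}, k ≠ j, are available), which is what guarantees exact marginals on a tree.

```python
import numpy as np

def sum_product_tree(n, edges, psi, K):
    """Exact sum-product on a tree-structured pairwise model.

    n:     number of nodes, each taking K states
    edges: list of (i, j) tree edges
    psi:   dict mapping (i, j) -> (K, K) array, psi[(i, j)][xi, xj]
    Returns normalized node marginals per eq. (4.31).
    """
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def pot(i, j):
        # Potential between i and j, oriented as [x_i, x_j] regardless
        # of how the edge was stored.
        return psi[(i, j)] if (i, j) in psi else psi[(j, i)].T

    msgs = {}

    def message(i, j):
        # Eq. (4.30): mu_{i->j}(x_j) = sum_{x_i} psi(x_i, x_j) *
        # prod_{k in delta(i) \ {j}} mu_{k->i}(x_i).
        # Recursion makes each message wait for its incoming messages.
        if (i, j) not in msgs:
            incoming = np.ones(K)
            for k in nbrs[i]:
                if k != j:
                    incoming = incoming * message(k, i)
            msgs[(i, j)] = pot(i, j).T @ incoming
        return msgs[(i, j)]

    # Eq. (4.31): p(x_i) proportional to the product of incoming messages.
    marg = np.zeros((n, K))
    for i in range(n):
        b = np.ones(K)
        for j in nbrs[i]:
            b = b * message(j, i)
        marg[i] = b / b.sum()
    return marg
```

On a tree the recursion always terminates and each marginal agrees with brute-force enumeration; on a loopy graph the same update gives only the approximate LBP fixed-point iteration mentioned in the slide title.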