# Lecture 5


## HMMs: Generative Accuracy

For an HMM to be generatively accurate, we can derive necessary conditions on the HMM, e.g., the number of required states. Recall that an $n$th-order Markov chain is convertible to a 1st-order one.

If $D(p(x_{1:T}) \,\|\, p_h(x_{1:T})) = 0$, then the following mutual information quantities must be equal:

$$I(X_{S_1}; X_{S_2}) = I_h(X_{S_1}; X_{S_2}) \qquad (5.62)$$

where $S_1, S_2 \subseteq 1{:}T$, $I(\cdot\,;\cdot)$ is the true mutual information, and $I_h(\cdot\,;\cdot)$ is the mutual information under the HMM:

$$I_h(X_{S_1}; X_{S_2}) = \sum_{x_{S_1 \cup S_2}} p_h(x_{S_1}, x_{S_2}) \log \frac{p_h(x_{S_1}, x_{S_2})}{p_h(x_{S_1})\, p_h(x_{S_2})} \qquad (5.63)$$

Define $X_{\neg t} \triangleq \{X_1, X_2, \ldots, X_{t-1}, X_{t+1}, \ldots, X_T\}$ (i.e., $\neg t$ is the set of all indices sans $t$).

*(Prof. Jeff Bilmes, EE596A/Winter 2013/DGMs, Lecture 5, Jan 25th, 2013, page 5-73 of 232.)*
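Equation (5.63) can be evaluated by brute force for small discrete distributions. The following is a minimal sketch, not from the lecture: the `mutual_information` helper and the toy joint distribution (three binary variables where $X_2$ is a copy of $X_0$ and $X_1$ is an independent fair coin) are hypothetical illustrations of the sum in (5.63).

```python
import itertools
import math

def mutual_information(joint, s1, s2):
    """Brute-force I(X_{S1}; X_{S2}) in bits, per the sum in (5.63).

    `joint` maps full assignments (tuples over all T variables) to
    probabilities; `s1`, `s2` are index sets (subsets of range(T)).
    """
    def marginal(indices):
        # Sum the joint pmf down to the variables at `indices`.
        m = {}
        for x, p in joint.items():
            key = tuple(x[i] for i in indices)
            m[key] = m.get(key, 0.0) + p
        return m

    s12 = sorted(set(s1) | set(s2))
    p12 = marginal(s12)
    p1, p2 = marginal(sorted(s1)), marginal(sorted(s2))
    # Positions of the S1 / S2 variables inside an S1-union-S2 assignment.
    pos1 = [s12.index(i) for i in sorted(s1)]
    pos2 = [s12.index(i) for i in sorted(s2)]

    mi = 0.0
    for x, p in p12.items():
        if p > 0:
            a = p1[tuple(x[i] for i in pos1)]
            b = p2[tuple(x[i] for i in pos2)]
            mi += p * math.log2(p / (a * b))
    return mi

# Hypothetical toy joint over T=3 binary variables: X2 = X0, X1 independent.
joint = {}
for x0, x1 in itertools.product([0, 1], repeat=2):
    joint[(x0, x1, x0)] = 0.25

print(mutual_information(joint, {0}, {2}))  # 1.0 bit: X0 determines X2
print(mutual_information(joint, {0}, {1}))  # 0.0 bits: independent
```

Computing the same sum under both $p$ and $p_h$ and comparing gives a direct (if exponential-cost) check of the necessary condition (5.62).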
## Theorem 5.8.1 (Necessary conditions for generative HMM accuracy)

An HMM with joint observation distribution $p_h(X_{1:T})$ will accurately model the true distribution $p(X_{1:T})$ only if the following three conditions hold for all $t$:

$$I_h(X_{\neg t}; Q_t) \ge I(X_t; X_{\neg t}), \qquad I_h(Q_t; X_t) \ge I(X_t; X_{\neg t}), \qquad |D_Q| \ge 2^{I(X_t; X_{\neg t})}$$

where $I_h(X_{\neg t}; Q_t)$ (resp. $I_h(Q_t; X_t)$) is the information transmission rate between $X_{\neg t}$ and $Q_t$ (resp. $Q_t$ and $X_t$) under an HMM, and $I(X_t; X_{\neg t})$ is the true information transmission rate between $X_t$ and $X_{\neg t}$.

### Proof

An accurate HMM (i.e., zero KL divergence from the true distribution) implies $I(X_{\neg t}; X_t) = I_h(X_{\neg t}; X_t)$. We expand $I_h(X_{\neg t}; Q_t, X_t)$ in two ways using the chain rule of mutual information:

$$I_h(X_{\neg t}; Q_t, X_t) = I_h(X_{\neg t}; Q_t) + I_h(X_{\neg t}; X_t \mid Q_t) \qquad (5.64)$$

$$= I(X_{\neg t}; X_t) + I_h(X_{\neg t}; Q_t \mid X_t) \qquad (5.65)$$

$$= I_h(\ldots \text{(preview ends here)}$$
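The third condition of Theorem 5.8.1 gives a concrete lower bound on the hidden-state cardinality $|D_Q|$ once $I(X_t; X_{\neg t})$ is known (in bits). A minimal sketch, not from the lecture; the function name `min_hmm_states` is a hypothetical illustration of rounding the bound $2^{I(X_t; X_{\neg t})}$ up to an integer state count:

```python
import math

def min_hmm_states(mi_bits):
    """Lower bound on |D_Q| from Theorem 5.8.1: |D_Q| >= 2^{I(X_t; X_neg_t)},
    with I measured in bits and the bound rounded up to a whole state count."""
    return math.ceil(2 ** mi_bits)

# If each observation carries 3.2 bits about the rest of the sequence, any
# generatively accurate HMM needs at least 2^3.2 ~= 9.19, i.e., 10 states.
print(min_hmm_states(3.2))  # 10
print(min_hmm_states(1.0))  # 2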