... for all $t$:
$$I_h(X_{\neg t}; Q_t) \ge I(X_t; X_{\neg t}),$$
$$I_h(Q_t; X_t) \ge I(X_t; X_{\neg t}), \text{ and}$$
$$|D_Q| \ge 2^{I(X_t; X_{\neg t})},$$

where $I_h(X_{\neg t}; Q_t)$ (resp. $I_h(Q_t; X_t)$) is the information transmission rate between $X_{\neg t}$ and $Q_t$ (resp. $Q_t$ and $X_t$) under an HMM, and $I(X_t; X_{\neg t})$ is the true information transmission rate between $X_t$ and $X_{\neg t}$.

Prof. Jeff Bilmes EE596A/Winter 2013/DGMs – Lecture 4 Jan 23rd, 2013 page 478 (of 239)

Proof: HMMs Generative Accuracy
Proof.
An accurate HMM (i.e., zero KL divergence from the true distribution) implies
$I(X_{\neg t}; X_t) = I_h(X_{\neg t}; X_t)$. We expand $I_h(X_{\neg t}; Q_t, X_t)$ in two ways
using the chain rule of mutual information:
\begin{align}
& I_h(X_{\neg t}; Q_t, X_t) \tag{4.80} \\
&= I_h(X_{\neg t}; Q_t) + I_h(X_{\neg t}; X_t \mid Q_t) \tag{4.81} \\
&= I_h(X_{\neg t}; X_t) + I_h(X_{\neg t}; Q_t \mid X_t) \tag{4.82} \\
&= I(X_{\neg t}; X_t) + I_h(X_{\neg t}; Q_t \mid X_t) \tag{4.83}
\end{align}

The HMM conditional independence properties say that
$I_h(X_{\neg t}; X_t \mid Q_t) = 0$, implying

$$I_h(X_{\neg t}; Q_t) = I(X_{\neg t}; X_t) + I_h(X_{\neg t}; Q_t \mid X_t) \tag{4.84}$$

or that

$$I_h(X_{\neg t}; Q_t) \ge I(X_{\neg t}; X_t) \tag{4.85}$$

since $I_h(X_{\neg t}; Q_t \mid X_t) \ge 0$. This is the first condition. Similarly, the
quantity $I_h(X_t; Q_t, X_{\neg t})$ may be expanded as follows:

\begin{align}
& I_h(X_t; Q_t, X_{\neg t}) \tag{4.86} \\
&= I_h(X_t; Q_t) + I_h(X_t; X_{\neg t} \mid Q_t) \tag{4.87} \\
&= I(X_t; X_{\neg t}) + I_h(X_t; Q_t \mid X_{\neg t}) \tag{4.88}
\end{align}

Reasoning as above, this leads to

$$I_h(X_t; Q_t) \ge I(X_t; X_{\neg t}), \tag{4.89}$$

the second condition.
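The "reasoning as above" step can be spelled out as follows. This is a sketch that reuses the HMM conditional independence property ($X_t$ is independent of $X_{\neg t}$ given $Q_t$) and the accuracy assumption $I_h(X_t; X_{\neg t}) = I(X_t; X_{\neg t})$, which is already folded into (4.88):

```latex
\begin{align*}
I_h(X_t; X_{\neg t} \mid Q_t) &= 0
  && \text{(HMM conditional independence)} \\
\Rightarrow\quad I_h(X_t; Q_t) &= I(X_t; X_{\neg t}) + I_h(X_t; Q_t \mid X_{\neg t})
  && \text{(equate (4.87) and (4.88))} \\
&\ge I(X_t; X_{\neg t})
  && \text{(conditional mutual information is non-negative)}
\end{align*}
```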
A sequence of inequalities establishes the third condition:

$$\log |D_Q| \ge H(Q_t) \ge H(Q_t) - H(Q_t \mid X_t) = I_h(Q_t; X_t) \ge I(X_t; X_{\neg t}),$$

so $|D_Q| \ge 2^{I(X_t; X_{\neg t})}$.
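The three conditions can be checked numerically on a toy example. The sketch below is not from the lecture: it assumes the true distribution is itself a small discrete HMM (so the KL divergence is zero and $I = I_h$), and the parameter values and the helper names `hmm_joint`, `entropy`, and `mutual_info` are made up for illustration. It enumerates the full joint over states and observations, so it is only practical for tiny $T$.

```python
import itertools
import math

import numpy as np


def hmm_joint(pi, A, B, T):
    """Joint p(q_1..q_T, x_1..x_T) as an array with axes (q_1..q_T, x_1..x_T)."""
    nq, nx = B.shape
    joint = np.zeros((nq,) * T + (nx,) * T)
    for q in itertools.product(range(nq), repeat=T):
        for x in itertools.product(range(nx), repeat=T):
            p = pi[q[0]] * B[q[0], x[0]]
            for t in range(1, T):
                p *= A[q[t - 1], q[t]] * B[q[t], x[t]]
            joint[q + x] = p
    return joint


def entropy(p):
    """Entropy in bits of a (possibly multi-dimensional) probability table."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def mutual_info(joint, a_axes, b_axes):
    """I(A; B) in bits, where A and B are disjoint tuples of axes of `joint`."""
    all_axes = set(range(joint.ndim))

    def marg(keep):
        drop = tuple(sorted(all_axes - set(keep)))
        return joint.sum(axis=drop)

    return entropy(marg(a_axes)) + entropy(marg(b_axes)) - entropy(marg(a_axes + b_axes))


# Illustrative (assumed) parameters: 2 states, 2 observation symbols, T = 3.
T, t = 3, 1  # t is a 0-based index; here the middle time step
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])  # state transition matrix
B = np.array([[0.8, 0.2], [0.3, 0.7]])  # observation probabilities per state
joint = hmm_joint(pi, A, B, T)

q_t = (t,)                                          # axis of Q_t
x_t = (T + t,)                                      # axis of X_t
x_not_t = tuple(T + s for s in range(T) if s != t)  # axes of X_{not t}

I_x = mutual_info(joint, x_t, x_not_t)       # I(X_t; X_{not t})
I_q_xnot = mutual_info(joint, x_not_t, q_t)  # I_h(X_{not t}; Q_t)
I_q_x = mutual_info(joint, q_t, x_t)         # I_h(Q_t; X_t)
log_states = math.log2(len(pi))              # log |D_Q|

print(f"I(X_t; X_not_t)     = {I_x:.4f} bits")
print(f"I_h(X_not_t; Q_t)   = {I_q_xnot:.4f} bits")
print(f"I_h(Q_t; X_t)       = {I_q_x:.4f} bits")
print(f"log2 |D_Q|          = {log_states:.4f} bits")
```

Because the toy "true" distribution is generated by the HMM itself, all three inequalities should hold here; the interesting case in the lecture is the converse, where a violated inequality rules out an accurate HMM.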
This is a lower bound: the number of states must be large enough that the hidden state is not a bottleneck, at the very least!
This could be quite large, and could grow with $T$.
The r.h.s. $I(X_t; X_{\neg t})$ is upper bounded by $H(X_{\neg t})$, which could be as bad as $\log |D_{X_{\neg t}}|$.
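To get a feel for how quickly this bound can grow, here is a small arithmetic sketch. It assumes discrete observations that all share one alphabet of `num_obs_values` symbols, so that $\log |D_{X_{\neg t}}| = (T - 1) \log |D_X|$; the function names are hypothetical and logs are base 2.

```python
import math


def min_states(mi_bits):
    """Smallest integer number of states with |D_Q| >= 2**mi_bits."""
    return math.ceil(2.0 ** mi_bits)


def worst_case_mi_bits(num_obs_values, T):
    """Upper bound log2 |D_{X_not_t}| = (T - 1) * log2 |D_X| on I(X_t; X_not_t).

    Assumes every X_s takes values in the same finite alphabet.
    """
    return (T - 1) * math.log2(num_obs_values)


# Binary observations, sequence length 11: the worst case is 10 bits,
# which would require at least 2**10 states.
bits = worst_case_mi_bits(num_obs_values=2, T=11)
print(bits, min_states(bits))  # 10.0 1024
```

The worst case is rarely attained in practice, since real sequences are far from having $X_t$ fully determined by (and fully determining) the rest of the sequence, but it shows why the required state count can scale exponentially in $T$.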
Nec. conds for HMMs Generative Accuracy

Insufficient states can lead to model inaccuracies (e.g., a state duration distribution that is geometric rather than something more realistic; one can add states to improve the duration distribution while sharing observation parameters).
The observation density family must be rich enough (2nd inequality).
Two bottlenecks: the observation density (e.g., number of components of a Gaussian mixture) and the time dependency (number of states).
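The duration remark can be illustrated with a short sketch. It assumes one particular construction, which the slides do not spell out: a single logical state is replaced by a left-to-right chain of `n_substates` sub-states that share the same observation distribution and the same self-loop probability. The total time spent in the chain is then a sum of geometric durations (a negative binomial), whose mode can sit away from 1. The names `duration_pmf` and `duration_mode` are hypothetical.

```python
from math import comb


def duration_pmf(d, self_loop, n_substates=1):
    """P(total duration = d) for a left-to-right chain of identical sub-states.

    Each sub-state is left with probability 1 - self_loop at every step.
    With n_substates = 1 this is the usual geometric HMM state duration.
    """
    if d < n_substates:
        return 0.0
    return (comb(d - 1, n_substates - 1)
            * self_loop ** (d - n_substates)
            * (1 - self_loop) ** n_substates)


def duration_mode(self_loop, n_substates, d_max=200):
    """Most probable duration, searched over 1..d_max."""
    return max(range(1, d_max + 1),
               key=lambda d: duration_pmf(d, self_loop, n_substates))


# One state: the geometric distribution always peaks at duration 1.
print(duration_mode(self_loop=0.7, n_substates=1))  # 1
# Three tied sub-states: the peak moves to a longer duration.
print(duration_mode(self_loop=0.7, n_substates=3))  # 7
```

The extra sub-states add only transition structure, not new observation parameters, which is the trade-off the slide points to: more states to fix the time-dependency bottleneck without enlarging the observation model.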