We expand $I_h(X_{\neg t}; Q_t, X_t)$ in two ways using the chain rule



Prof. Jeff Bilmes EE596A/Winter 2013/DGMs – Lecture 5 - Jan 25th, 2013 page 5-75 (of 232)

\[
\ldots = I(X_{\neg t}; X_t) + I_h(X_{\neg t}; Q_t \mid X_t) \tag{5.66}
\]

The HMM conditional independence properties say that $I_h(X_{\neg t}; X_t \mid Q_t) = 0$, implying

\[
I_h(X_{\neg t}; Q_t) = I(X_{\neg t}; X_t) + I_h(X_{\neg t}; Q_t \mid X_t) \tag{5.68}
\]

Proof cont.: HMMs Generative Accuracy

That is,

\[
I_h(X_{\neg t}; Q_t) \ge I(X_{\neg t}; X_t) \tag{5.69}
\]

since $I_h(X_{\neg t}; Q_t \mid X_t) \ge 0$. This is the first condition.

Similarly, the quantity $I_h(X_t; Q_t, X_{\neg t})$ may be expanded as follows:

\begin{align}
& I_h(X_t; Q_t, X_{\neg t}) \tag{5.70} \\
&= I_h(X_t; Q_t) + I_h(X_t; X_{\neg t} \mid Q_t) \tag{5.71} \\
&= I(X_t; X_{\neg t}) + I_h(X_t; Q_t \mid X_{\neg t}) \tag{5.72}
\end{align}

Reasoning as above, this leads to

\[
I_h(X_t; Q_t) \ge I(X_t; X_{\neg t}), \tag{5.73}
\]

the second condition.

A sequence of inequalities establishes the third condition:

\[
\log |D_Q| \ge H(Q_t) \ge H(Q_t) - H(Q_t \mid X_t) = I_h(Q_t; X_t) \ge I(X_t; X_{\neg t})
\]

so $|D_Q| \ge 2^{I(X_t; X_{\neg t})}$.

This is a lower bound: the number of states must have enough capacity so that it is not a bottleneck, at the very least! This could be quite large, and grow with $T$. The right-hand side $I(X_t; X_{\neg t})$ is upper bounded by $H(X_{\neg t})$, which could be as bad as $\log |D_{X_{\neg t}}|$.

Nec. conditions for HMMs Generative Accuracy

- Insufficient states can lead to model inaccuracies (e.g., a state duration distribution that is geometric rather than something more realistic; add states to improve the duration distribution while sharing observation parameters).
- Observation density family must be rich enough (2nd inequality).
- Two bottlenecks: observation density (e.g., number of components of a Gaussian mixture), and time-dependency (number of states).

Suff conditions for HMMs Generative Accuracy

Theorem 5.8.2 (Sufficient conditions for HMM accuracy). An HMM $p_h(X_{1:T})$ will accurately represent a true discrete distribution $p(X_{1:T})$ if the following conditions hold for all $t$:

\[
H(Q_t \mid X_{<t}) = 0
\]
\[
p_h(X_t = x_t \mid q_{x_{<t}}) = p(X_t = x_t \mid X_{<t} = x_{<t}),
\]

where $q_{x_{<t}} = f(x_{<t})$ is the unique state sub-sequence associated with $x_{<t}$.

Quite strong and unrealistic requirements, but they guarantee accuracy nonetheless.

Note: $\{<t\} \triangleq \{1, 2, \ldots, t-1\}$.

Proof: Suff conds for HMMs Generative Accuracy

Proof. We have for all $t$:

\begin{align}
D\big(p(X_t \mid X_{<t}) \,\|\, p_h(X_t \mid X_{<t})\big)
&= \sum_{x_{1:t}} p(x_{1:t}) \log \frac{p(x_t \mid x_{<t})}{p_h(x_t \mid x_{<t})} \tag{5.74} \\
&= \sum_{x_{1:t}} p(x_{1:t}) \log p(x_t \mid \ldots \tag{5.75}
\end{align}
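The chain-rule expansions and the three necessary conditions above can be checked numerically on a small example. The following is a minimal sketch, not part of the lecture: it assumes a hypothetical 2-state, 2-symbol HMM with $T = 3$ (the parameters `pi`, `A`, `B` are made up), takes $t = 2$, and treats the true distribution $p$ as the HMM itself, so that $I = I_h$ throughout. The helper names `entropy` and `mi` are illustrative.

```python
import itertools
import numpy as np

# Hypothetical toy HMM (assumed values): 2 states, 2 observation symbols, T = 3.
pi = np.array([0.6, 0.4])                 # initial state distribution
A = np.array([[0.9, 0.1], [0.2, 0.8]])    # A[i, j] = p(Q_{t+1}=j | Q_t=i)
B = np.array([[0.7, 0.3], [0.1, 0.9]])    # B[i, k] = p(X_t=k | Q_t=i)

# Joint table over axes (q1, q2, q3, x1, x2, x3), built by brute-force enumeration.
joint = np.zeros((2,) * 6)
for q1, q2, q3, x1, x2, x3 in itertools.product(range(2), repeat=6):
    joint[q1, q2, q3, x1, x2, x3] = (
        pi[q1] * A[q1, q2] * A[q2, q3] * B[q1, x1] * B[q2, x2] * B[q3, x3]
    )

def entropy(axes):
    """Entropy in bits of the marginal of `joint` over the given set of axes."""
    if not axes:
        return 0.0
    drop = tuple(a for a in range(joint.ndim) if a not in axes)
    p = joint.sum(axis=drop).ravel()
    p = p[p > 0]                          # skip zero cells (0 log 0 = 0)
    return float(-(p * np.log2(p)).sum())

def mi(a, b, c=()):
    """Conditional mutual information I(a; b | c) in bits, via joint entropies."""
    a, b, c = set(a), set(b), set(c)
    return entropy(a | c) + entropy(b | c) - entropy(a | b | c) - entropy(c)

# Axis indices for t = 2: Q_t = q2, X_t = x2, X_{not t} = (x1, x3).
Q_T, X_T, X_NOT = {1}, {4}, {3, 5}

# Two chain-rule expansions of I_h(X_{not t}; Q_t, X_t).
lhs = mi(X_NOT, Q_T | X_T)
way1 = mi(X_NOT, Q_T) + mi(X_NOT, X_T, Q_T)   # second term is 0 for an HMM
way2 = mi(X_NOT, X_T) + mi(X_NOT, Q_T, X_T)

print("I_h(X_not_t; Q_t, X_t):", lhs, way1, way2)
print("I_h(X_not_t; X_t | Q_t):", mi(X_NOT, X_T, Q_T))
print("condition 1:", mi(X_NOT, Q_T), ">=", mi(X_NOT, X_T))
print("condition 2:", mi(X_T, Q_T), ">=", mi(X_T, X_NOT))
print("condition 3: log2|D_Q| =", np.log2(2), ">=", mi(X_T, X_NOT))
```

Brute-force enumeration keeps the sketch transparent but scales exponentially in $T$, so it is only suitable for toy sizes like this one.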

This document was uploaded on 04/05/2014.
