Observations are not "Viterbi i.i.d."
(Prof. Jeff Bilmes, EE596A/Winter 2013/DGMs, Lecture 4, Jan 23rd, 2013)

The Viterbi path (most-probable explanation) of an HMM is defined as follows:

    q^*_{1:T} \in \operatorname{argmax}_{q_{1:T}} p(X_{1:T} = x_{1:T}, q_{1:T})    (4.44)

If we max-marginalize over the hidden states, does that lead to an i.i.d. distribution? The "Viterbi" distribution of the HMM is:

    p_{\text{vit}}(X_{1:T} = x_{1:T}) = c \, p(X_{1:T} = x_{1:T}, Q_{1:T} = q^*_{1:T})    (4.43)
                                      = c \max_{q_{1:T}} p(X_{1:T} = x_{1:T}, Q_{1:T} = q_{1:T})
                                      = c \max_{q_{1:T}} \prod_{t=1}^{T} p(X_t = x_t \mid Q_t = q_t) \, p(Q_t = q_t \mid Q_{t-1} = q_{t-1})

where c is a positive normalizing constant over x_{1:T}. This is just a different semi-ring: the recursion has the same structure as sum-product inference, with max replacing sum. The resulting distribution over observations does not in general factorize, so the observations are not i.i.d. here either.

The Viterbi path (4.44) is a standard method for finding a mapping from observations x_{1:T} to candidate answers q^*_{1:T}, also called a "decoding."

Other times, we might have K HMMs corresponding to K classes,

    \{ p_k(x_{1:T}, q_{1:T}) : k = 1, \dots, K \}    (4.45)

and we wish to perform classification among the K classes as follows:

    (k^*, q^*_{1:T}) = \operatorname{argmax}_{q_{1:T},\, k} p_k(x_{1:T}, q_{1:T})    (4.46)

and k^* becomes part of the hypothesized answer.

We can view this as K distributions over just the observations, i.e.,

    p_k^{\text{vit}}(x_{1:T}) \propto p_k(x_{1:T}, q^{*,k}_{1:T})    (4.47)
                              \propto \max_{q_{1:T}} p_k(x_{1:T}, q_{1:T})    (4.48)
                              \propto \max_{q_{1:T}} \prod_{t=1}^{T} p_k(x_t \mid q_t) \, p_k(q_t \mid q_{t-1})    (4.49)

where q^{*,k}_{1:T} denotes the Viterbi path under model k.
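To make the max-marginalization concrete, here is a minimal NumPy sketch of the max-product (Viterbi) recursion behind Eq. (4.43): it computes max_{q_{1:T}} p(x_{1:T}, q_{1:T}) (the Viterbi score, i.e., p_vit up to the constant c) together with the path q^*_{1:T}. The function name viterbi_score and the toy parameters (pi, A, B) are illustrative assumptions, not taken from the lecture.

    # Max-product (Viterbi) recursion for a discrete-observation HMM.
    # All parameter values below are made up for illustration.
    import numpy as np

    def viterbi_score(pi, A, B, obs):
        """Return (max joint probability, argmax state path q*_{1:T}).

        pi  : (S,)   initial state distribution p(Q_1)
        A   : (S, S) transition matrix, A[i, j] = p(Q_t = j | Q_{t-1} = i)
        B   : (S, V) emission matrix,  B[j, v] = p(X_t = v | Q_t = j)
        obs : length-T sequence of observation symbols in {0, ..., V-1}
        """
        T, S = len(obs), len(pi)
        delta = pi * B[:, obs[0]]            # delta_1(j) = p(Q_1=j) p(x_1|Q_1=j)
        back = np.zeros((T, S), dtype=int)   # backpointers for path recovery
        for t in range(1, T):
            scores = delta[:, None] * A      # scores[i, j] = delta_{t-1}(i) a_{ij}
            back[t] = scores.argmax(axis=0)  # best predecessor of each state j
            delta = scores.max(axis=0) * B[:, obs[t]]
        # backtrack the most-probable explanation q*_{1:T}
        path = [int(delta.argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(back[t, path[-1]]))
        return float(delta.max()), path[::-1]

    # tiny 2-state, 2-symbol example
    pi = np.array([0.6, 0.4])
    A = np.array([[0.7, 0.3], [0.4, 0.6]])
    B = np.array([[0.9, 0.1], [0.2, 0.8]])
    score, q_star = viterbi_score(pi, A, B, [0, 1, 0])
    print(score, q_star)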
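As a small numeric check of the non-factorization claim (again only a sketch, reusing the made-up pi, A, B above): for T = 2 we can enumerate all observation pairs, normalize the Viterbi scores to obtain p_vit(x_1, x_2), and compare against the product of its marginals. The two matrices differ, so the observations are not independent under p_vit.

    # Enumerate p_vit(x_1, x_2) for the toy model and test independence.
    V = B.shape[1]
    joint = np.array([[viterbi_score(pi, A, B, [x1, x2])[0]
                       for x2 in range(V)]
                      for x1 in range(V)])
    joint /= joint.sum()            # apply the normalizing constant c
    p1 = joint.sum(axis=1)          # marginal p_vit(x_1)
    p2 = joint.sum(axis=0)          # marginal p_vit(x_2)
    print("p_vit(x1, x2):\n", joint)
    print("p_vit(x1) * p_vit(x2):\n", np.outer(p1, p2))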
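The classification rule (4.46) then amounts to running the same recursion under each class-conditional HMM and keeping the class and path with the highest joint score. A hedged sketch, reusing viterbi_score above; the second class model is another made-up parameter set, not from the lecture.

    # K-class Viterbi decoding, Eq. (4.46):
    # (k*, q*_{1:T}) = argmax over classes k and paths q of p_k(x_{1:T}, q_{1:T})
    models = {
        "class_0": (pi, A, B),  # reuse the toy model above
        "class_1": (np.array([0.5, 0.5]),
                    np.array([[0.5, 0.5], [0.5, 0.5]]),
                    np.array([[0.3, 0.7], [0.8, 0.2]])),
    }

    def classify(obs, models):
        best_k, best_score, best_path = None, -1.0, None
        for k, (pi_k, A_k, B_k) in models.items():
            score, path = viterbi_score(pi_k, A_k, B_k, obs)
            if score > best_score:
                best_k, best_score, best_path = k, score, path
        return best_k, best_path

    print(classify([0, 1, 0], models))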