HMM - learning with gradient descent (cont. II)

Say we're interested in ∂/∂a_ij. Let's expand the numerator above:

    numerator = ∂/∂a_ij Σ_{q_{1:T}} p(x_{1:T}, q_{1:T} | λ)
              = Σ_{q_{1:T}} ∂/∂a_ij ∏_t p(x_t | q_t) p(q_t | q_{t−1})                        (4.15)

Define T_ij(q_{1:T}) ≜ {t : q_{t−1} = i, q_t = j} in the following:

    numerator = Σ_{q_{1:T}} [∏_t p(x_t | q_t)] [∏_{t ∉ T_ij(q_{1:T})} p(q_t | q_{t−1})]
                          × ∂/∂a_ij ∏_{t ∈ T_ij(q_{1:T})} p(q_t | q_{t−1})                   (4.16)

Prof. Jeff Bilmes, EE596A/Winter 2013/DGMs – Lecture 4, Jan 23rd, 2013.

HMM - learning with gradient descent

Since p(q_t | q_{t−1}) = a_ij for each t ∈ T_ij(q_{1:T}), the last product equals a_ij^{|T_ij(q_{1:T})|}, so we get

    num = Σ_{q_{1:T}} [∏_t p(x_t | q_t)] [∏_{t ∉ T_ij(q_{1:T})} p(q_t | q_{t−1})] ∂/∂a_ij a_ij^{|T_ij(q_{1:T})|}
        = Σ_{q_{1:T}} [∏_t p(x_t | q_t)] [∏_{t ∉ T_ij(q_{1:T})} p(q_t | q_{t−1})] |T_ij(q_{1:T})| a_ij^{|T_ij(q_{1:T})|−1}
        = (1/a_ij) Σ_{q_{1:T}} p(x_{1:T}, q_{1:T}) |T_ij(q_{1:T})|
        = (1/a_ij) Σ_{q_{1:T}} p(x_{1:T}, q_{1:T}) Σ_t 1{q_{t−1} = i, q_t = j}
        = (1/a_ij) Σ_t Σ_{q_{1:T}} p(x_{1:T}, q_{1:T}) 1{q_{t−1} = i, q_t = j}
        = (1/a_ij) Σ_t p(x_{1:T}, q_{t−1} = i, q_t = j)

HMM - learning with gradient descent

Therefore

    ∂f(λ)/∂a_ij = [∂/∂a_ij Σ_{q_{1:T}} p(x_{1:T}, q_{1:T} | λ)] / p(x_{1:T} | λ)
                = (1/a_ij) Σ_t p_λ(x_{1:T}, q_{t−1} = i, q_t = j) / p(x_{1:T} | λ)           (4.17)
                = (1/a_ij) Σ_t p_λ(q_{t−1} = i, q_t = j | x_{1:T})                           (4.18)

This means that, as in EM, gradient-descent learning also needs, for all t, the queries p(Q_t = j, Q_{t−1} = i | x_{1:T}) from the HMM. A similar analysis shows that we also need p(Q_t = i | x_{1:T}) for all t. These queries are also needed when performing discriminative training. So clique posteriors are fundamental; we must have a procedure that produces them quickly.

Main point: inference is important

Main point of the last few slides: probabilistic "inference" (computing probabilities of certain sets of random variables) is needed by many operations, including:

- Decision making (e.g., finding the random-variable values that have the highest probability)
- Analysis and debugging
- Learning the parameters of the model in response to training data.
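The clique-posterior queries above can be illustrated with a small numeric sketch (the toy model and all names here are our own, not from the lecture): the forward-backward recursions yield ξ_t(i, j) = p(q_{t−1} = i, q_t = j | x_{1:T}) and γ_t(i) = p(Q_t = i | x_{1:T}), and per (4.18) the quantity (1/a_ij) Σ_t ξ_t(i, j) should match a finite-difference derivative of log p(x_{1:T} | λ) with respect to a_ij.

```python
import numpy as np

# A minimal sketch, not the lecture's code: a toy discrete-output HMM with
# A[i, j] = a_ij = p(q_t = j | q_{t-1} = i), B[i, k] = p(x_t = k | q_t = i),
# and initial state distribution pi. All names are illustrative assumptions.
rng = np.random.default_rng(0)
N, K, T = 3, 4, 6                                  # states, symbols, length
A = rng.random((N, N)); A /= A.sum(axis=1, keepdims=True)
B = rng.random((N, K)); B /= B.sum(axis=1, keepdims=True)
pi = np.ones(N) / N
x = rng.integers(0, K, size=T)                     # an observed sequence x_{1:T}

def forward_backward(A, B, pi, x):
    """Unnormalized alpha/beta recursions; adequate for short toy sequences."""
    T, N = len(x), len(pi)
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = pi * B[:, x[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, x[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, x[t + 1]] * beta[t + 1])
    return alpha, beta, alpha[-1].sum()            # alpha[-1].sum() = p(x_{1:T})

alpha, beta, px = forward_backward(A, B, pi, x)

# Clique posteriors xi[t-1, i, j] = p(q_{t-1} = i, q_t = j | x_{1:T}),
# the query that appears in (4.18).
xi = np.zeros((T - 1, N, N))
for t in range(1, T):
    xi[t - 1] = (alpha[t - 1][:, None] * A * B[:, x[t]][None, :] * beta[t][None, :]) / px

# Single-node posteriors gamma[t, i] = p(Q_t = i | x_{1:T}), the other needed query.
gamma = alpha * beta / px

# Eq. (4.18): d/da_ij log p(x_{1:T} | lambda) = (1/a_ij) * sum_t xi_t(i, j).
i, j = 0, 1
analytic = xi[:, i, j].sum() / A[i, j]

# Cross-check with an unconstrained finite-difference derivative in a_ij
# (the gradient above likewise ignores the row-sum-to-one constraint).
eps = 1e-6
A2 = A.copy(); A2[i, j] += eps
numeric = (np.log(forward_backward(A2, B, pi, x)[2]) - np.log(px)) / eps
print(analytic, numeric)                           # the two should agree closely
```

Note the design point the slide makes: one pass of forward-backward produces every ξ_t and γ_t at once, which is exactly why a fast procedure for these clique posteriors serves EM, gradient descent, and discriminative training alike.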