Unformatted text preview: dient descent (cont. II)
Say we’re interested in ∂/∂aij . Lets expand the numerator above: numerator = ∂
∂aij p(x1:T , q1:T λ) =
q1:T ∂
∂aij p(xt qt )p(qt qt−1 )
q1:T t (4.15)
∆ Deﬁne Tij (q1:T ) = {t : qt−1 = i, qt = j } in the following: numerator = ∂
∂aij p(xt qt )
q1:T t aij
t∈Tij (q1:T ) p(qt qt−1 )
t∈Tij (q1:T ) (4.16) Prof. Jeﬀ Bilmes EE596A/Winter 2013/DGMs – Lecture 4  Jan 23rd, 2013 page 423 (of 239) HMMs HMMs as GMs Other HMM queries What HMMs can do MPE Summ HMM  learning with gradient descent
We get

\[
\begin{aligned}
\text{num}
&= \sum_{q_{1:T}} \Bigl[\prod_t p(x_t \mid q_t)\Bigr]\, |T_{ij}(q_{1:T})|\, a_{ij}^{|T_{ij}(q_{1:T})|-1}
   \prod_{t \notin T_{ij}(q_{1:T})} p(q_t \mid q_{t-1})
   \qquad \text{since } \tfrac{\partial}{\partial a_{ij}}\, a_{ij}^{|T_{ij}|} = |T_{ij}|\, a_{ij}^{|T_{ij}|-1} \\
&= \frac{1}{a_{ij}} \sum_{q_{1:T}} |T_{ij}(q_{1:T})| \prod_t p(x_t \mid q_t)\, p(q_t \mid q_{t-1}) \\
&= \frac{1}{a_{ij}} \sum_{q_{1:T}} p(x_{1:T}, q_{1:T})\, |T_{ij}(q_{1:T})| \\
&= \frac{1}{a_{ij}} \sum_{q_{1:T}} p(x_{1:T}, q_{1:T}) \sum_t \mathbf{1}\{q_{t-1} = i,\, q_t = j\} \\
&= \frac{1}{a_{ij}} \sum_t \sum_{q_{1:T}} p(x_{1:T}, q_{1:T})\, \mathbf{1}\{q_{t-1} = i,\, q_t = j\} \\
&= \frac{1}{a_{ij}} \sum_t p(x_{1:T},\, q_{t-1} = i,\, q_t = j)
\end{aligned}
\]
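As a sanity check (not from the slides), the identity above can be verified numerically: on a hypothetical 2-state, 2-symbol HMM with illustrative parameter values, a finite-difference derivative of \(p(x_{1:T})\) with respect to \(a_{ij}\) (treating \(a_{ij}\) as a free parameter, with the sum-to-one constraint handled separately, as in the derivation) should match \(\frac{1}{a_{ij}} \sum_t p(x_{1:T}, q_{t-1}=i, q_t=j)\), both computed by brute-force enumeration over state sequences.

```python
import itertools
import numpy as np

# Hypothetical tiny HMM; all parameter values are illustrative, not from the lecture.
pi = np.array([0.6, 0.4])            # initial distribution p(q_1)
A  = np.array([[0.7, 0.3],
               [0.2, 0.8]])          # a_ij = p(q_t = j | q_{t-1} = i)
B  = np.array([[0.9, 0.1],
               [0.3, 0.7]])          # b_j(k) = p(x_t = k | q_t = j)
x  = [0, 1, 1]                       # a fixed observation sequence
T  = len(x)

def joint(q, A):
    """p(x_{1:T}, q_{1:T}): product of the chain-rule factors."""
    p = pi[q[0]] * B[q[0], x[0]]
    for t in range(1, T):
        p *= A[q[t-1], q[t]] * B[q[t], x[t]]
    return p

def marginal(A):
    """p(x_{1:T}) by brute-force enumeration over all state sequences."""
    return sum(joint(q, A) for q in itertools.product(range(2), repeat=T))

i, j = 0, 1

# Right-hand side: (1/a_ij) * sum_t p(x_{1:T}, q_{t-1} = i, q_t = j),
# using |T_ij(q_{1:T})| = sum_t 1{q_{t-1} = i, q_t = j}.
rhs = sum(joint(q, A) * sum(1 for t in range(1, T) if q[t-1] == i and q[t] == j)
          for q in itertools.product(range(2), repeat=T)) / A[i, j]

# Left-hand side: central finite difference of p(x_{1:T}) in a_ij alone
# (rows are NOT renormalized; the constraint is handled separately).
eps = 1e-6
Ap, Am = A.copy(), A.copy()
Ap[i, j] += eps
Am[i, j] -= eps
lhs = (marginal(Ap) - marginal(Am)) / (2 * eps)

print(abs(lhs - rhs) < 1e-8)  # the two expressions agree
```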
\[
\frac{\partial}{\partial a_{ij}} f(\lambda)
= \frac{\dfrac{\partial}{\partial a_{ij}} \sum_{q_{1:T}} p(x_{1:T}, q_{1:T} \mid \lambda)}{p(x_{1:T} \mid \lambda)}
= \frac{1}{a_{ij}} \sum_t \frac{p_\lambda(x_{1:T},\, q_{t-1} = i,\, q_t = j)}{p_\lambda(x_{1:T})}
\tag{4.17}
\]
\[
= \frac{1}{a_{ij}} \sum_t p_\lambda(q_{t-1} = i,\, q_t = j \mid x_{1:T})
\tag{4.18}
\]

This means that, like in EM, gradient descent learning also needs, for all t, the queries \(p(Q_t = j, Q_{t-1} = i \mid x_{1:T})\) from the HMM. A similar analysis shows that we also need \(p(Q_t = i \mid x_{1:T})\) for all t. These are also needed when performing discriminative training. So clique posteriors are fundamental: we must have a procedure that produces them quickly.
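Such a procedure is the forward–backward algorithm, which computes exactly these pairwise and singleton posteriors in O(T N²) time. A minimal sketch on a hypothetical 2-state HMM (the parameter names and values here are illustrative, not from the lecture):

```python
import numpy as np

# Hypothetical example HMM; pi, A, B and their values are illustrative.
pi = np.array([0.6, 0.4])
A  = np.array([[0.7, 0.3],
               [0.2, 0.8]])          # a_ij = p(q_t = j | q_{t-1} = i)
B  = np.array([[0.9, 0.1],
               [0.3, 0.7]])          # b_j(k) = p(x_t = k | q_t = j)
x  = [0, 1, 1]
T, N = len(x), len(pi)

# Forward pass: alpha[t, j] = p(x_{1:t+1}, q_{t+1} = j)   (0-indexed t)
alpha = np.zeros((T, N))
alpha[0] = pi * B[:, x[0]]
for t in range(1, T):
    alpha[t] = (alpha[t-1] @ A) * B[:, x[t]]

# Backward pass: beta[t, i] = p(x_{t+2:T} | q_{t+1} = i)
beta = np.ones((T, N))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, x[t+1]] * beta[t+1])

px = alpha[-1].sum()                 # p(x_{1:T})

# Pairwise posteriors xi[t, i, j] = p(q_{t+1} = i, q_{t+2} = j | x_{1:T}):
# exactly the p(Q_{t-1} = i, Q_t = j | x_{1:T}) queries the gradient needs.
xi = np.zeros((T - 1, N, N))
for t in range(T - 1):
    xi[t] = alpha[t][:, None] * A * (B[:, x[t+1]] * beta[t+1])[None, :] / px

# Singleton posteriors gamma[t, i] = p(q_{t+1} = i | x_{1:T})
gamma = alpha * beta / px
```

Each `xi[t]` sums to one over all (i, j) pairs, and marginalizing out j recovers `gamma[t]`, which is a quick internal consistency check when implementing this.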
Main point: inference is important

Main point of the last few slides: probabilistic "inference" (computing probabilities of certain sets of random variables) is needed by many operations, including:

- Decision making (e.g., finding which random-variable values have the highest probability)
- Analysis and debugging
- Learning the parameters of the model in response to training data.