Ch 9.3: Latent Variable View of EM

Sargur Srihari, srihari@cedar.buffalo.edu

Main Idea of EM

• The goal of EM is to find maximum likelihood solutions for models having latent variables.
• In the case of mixture models:
  – we have a complex distribution over the observed variables x;
  – we wish to estimate its parameters.
• Introduce latent variables z so that:
  – the joint distribution p(x, z) of observed and latent variables is more tractable (since we know the forms of the components);
  – the complicated distribution is formed from simpler components.
• The original distribution is obtained by marginalizing the joint distribution:

  p(\mathbf{x}) = \sum_{\mathbf{z}} p(\mathbf{x}, \mathbf{z})
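
To make the marginalization concrete, here is a minimal Python sketch (not from the slides; the two-component 1-D Gaussian mixture and all parameter values are made up for illustration). The joint p(x, z = k) is a single tractable Gaussian term per component, and the complicated density p(x) is recovered by summing the joint over z.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical parameters for a K=2 mixture (made up for illustration).
pi = np.array([0.3, 0.7])      # mixing coefficients, p(z = k)
mu = np.array([-2.0, 3.0])     # component means
sigma = np.array([1.0, 0.5])   # component standard deviations

def joint(x, k):
    # p(x, z=k) = p(z=k) p(x | z=k): one simple Gaussian factor per component
    return pi[k] * norm.pdf(x, loc=mu[k], scale=sigma[k])

def marginal(x):
    # p(x) = sum_k p(x, z=k): the complicated mixture density
    return sum(joint(x, k) for k in range(len(pi)))

print(marginal(0.0))   # density of the mixture at x = 0
```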

Alternative View of EM

• This view recognizes the key role played by the latent variables.
• Observed data and latent variables:

  \mathbf{X} = \begin{bmatrix} \mathbf{x}_1^T \\ \mathbf{x}_2^T \\ \vdots \\ \mathbf{x}_N^T \end{bmatrix},
  \qquad
  \mathbf{Z} = \begin{bmatrix} \mathbf{z}_1^T \\ \mathbf{z}_2^T \\ \vdots \\ \mathbf{z}_N^T \end{bmatrix}

  – where the n-th row of X is x_n^T = [x_{n1}, x_{n2}, ..., x_{nD}]
  – with corresponding row z_n^T = [z_{n1}, z_{n2}, ..., z_{nK}]
• The goal of the EM algorithm is to find maximum likelihood solutions for models having latent variables.
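
As a small illustration of these shapes (purely hypothetical sizes and assignments; for a mixture model each z_n is typically a 1-of-K binary vector):

```python
import numpy as np

N, D, K = 5, 2, 3                  # illustrative sizes only
rng = np.random.default_rng(0)

X = rng.normal(size=(N, D))        # row n is x_n^T = [x_n1 ... x_nD]
z_idx = rng.integers(K, size=N)    # hypothetical component assignments
Z = np.eye(K)[z_idx]               # row n is z_n^T, a 1-of-K binary vector

print(X.shape, Z.shape)            # (5, 2) (5, 3)
```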

Likelihood Function involving Latent Variables

• The joint likelihood function is p(X, Z | θ), where θ is the set of all model parameters
  – e.g., the means, covariances, and mixing coefficients.
• The marginal likelihood function of the observed data follows from the sum rule:

  p(\mathbf{X} \mid \boldsymbol{\theta}) = \sum_{\mathbf{Z}} p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\theta})

• The log likelihood function is

  \ln p(\mathbf{X} \mid \boldsymbol{\theta}) = \ln \left\{ \sum_{\mathbf{Z}} p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\theta}) \right\}
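
A sketch of evaluating this log likelihood, assuming a Gaussian mixture model (the slides have not fixed a specific model at this point). For i.i.d. data the sum over all of Z factorizes into a per-point sum over the K components, which logsumexp evaluates stably:

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp

def log_likelihood(X, pi, mu, cov):
    # log p(x_n, z_n = k | theta) for every data point n and component k
    log_joint = np.stack(
        [np.log(pi[k]) + multivariate_normal.logpdf(X, mean=mu[k], cov=cov[k])
         for k in range(len(pi))], axis=1)
    # marginalize z inside the log (sum rule), then sum over data points
    return logsumexp(log_joint, axis=1).sum()
```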

Latent Variables in EM

• The log likelihood function is

  \ln p(\mathbf{X} \mid \boldsymbol{\theta}) = \ln \left\{ \sum_{\mathbf{Z}} p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\theta}) \right\}

• Key observation: the summation over the latent variables appears inside the logarithm.
  – The summation inside the braces comes from the marginalization, not from the log-likelihood itself.
• Even if the joint distribution p(X, Z | θ) belongs to the exponential family, the marginal distribution p(X | θ) typically does not.
• This results in complicated expressions for the maximum likelihood solution.
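
To see the contrast concretely, compare the two forms for a Gaussian mixture (a standard worked example, not part of these slides): with complete data the 1-of-K variable z_nk brings the logarithm to each Gaussian factor directly, while with incomplete data the sum over components is trapped inside the logarithm.

```latex
% Complete data: z_{nk} selects one component, so the log reaches each factor
\ln p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\theta})
  = \sum_{n=1}^{N} \sum_{k=1}^{K} z_{nk}
    \bigl\{ \ln \pi_k + \ln \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \bigr\}

% Incomplete data: the sum over k stays inside the log, coupling all parameters
\ln p(\mathbf{X} \mid \boldsymbol{\theta})
  = \sum_{n=1}^{N} \ln \Bigl\{ \sum_{k=1}^{K} \pi_k \,
    \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \Bigr\}
```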

Complete and Incomplete Data Sets

• Complete data: {X, Z}
  – For each observation in X we know the corresponding value of the latent variable in Z.
  – The log-likelihood has the form ln p(X, Z | θ), and its maximization is straightforward.
• Incomplete data: {X}
  – This is the actual data set.
  – The log likelihood function is

    \ln p(\mathbf{X} \mid \boldsymbol{\theta}) = \ln \left\{ \sum_{\mathbf{Z}} p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol{\theta}) \right\}

  – Maximization is difficult because of the summation inside the logarithm.
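
A minimal sketch of why the complete-data maximization is straightforward, assuming a Gaussian mixture with observed 1-of-K assignments (function and variable names are illustrative): the maximum likelihood estimates reduce to per-component counts and averages, with no iteration required.

```python
import numpy as np

def complete_data_mle(X, Z):
    # X: (N, D) observations; Z: (N, K) observed 1-of-K assignments
    Nk = Z.sum(axis=0)              # effective number of points per component
    pi = Nk / X.shape[0]            # mixing coefficients
    mu = (Z.T @ X) / Nk[:, None]    # per-component sample means
    return pi, mu                   # covariances follow the same pattern
```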

Expectation of the Log-Likelihood

• Since we do not have the complete data set {X, Z}, we instead evaluate the expectation of the complete-data log-likelihood ln p(X, Z | θ).
• Since we are given X, our state of knowledge of Z comes only from the posterior distribution of the latent variables, p(Z | X, θ).
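
As a sketch, again assuming a Gaussian mixture: this posterior is the matrix of responsibilities, obtained by applying Bayes' theorem to each data point and normalizing the joint over the components.

```python
import numpy as np
from scipy.stats import multivariate_normal

def posterior_Z(X, pi, mu, cov):
    # gamma[n, k] = p(z_nk = 1 | x_n, theta)
    #             = pi_k N(x_n | mu_k, Sigma_k) / sum_j pi_j N(x_n | mu_j, Sigma_j)
    joint = np.stack([pi[k] * multivariate_normal.pdf(X, mean=mu[k], cov=cov[k])
                      for k in range(len(pi))], axis=1)
    return joint / joint.sum(axis=1, keepdims=True)
```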