Ch 9.3: Latent Variable View of EM


Machine Learning (Srihari)
Latent Variable View of EM
Sargur Srihari, srihari@cedar.buffalo.edu

Main Idea of EM
• The goal of EM is to find maximum likelihood solutions for models having latent variables.
• In the case of mixture models:
  – We have a complex distribution over the observed variables x.
  – We wish to estimate its parameters.
• Introduce latent variables z so that:
  – the joint distribution p(x, z) of observed and latent variables is more tractable (we know the forms of the components);
  – the complicated distribution is formed from simpler components.
• The original distribution is recovered by marginalizing the joint distribution: p(x) = Σ_z p(x, z).
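The marginalization idea can be sketched numerically. The following is an illustrative example, not from the source: a 1-D two-component Gaussian mixture in which the latent variable z selects the component, and the marginal p(x) is recovered by summing the joint p(x, z) over z. All parameter values are assumed for illustration.

```python
import numpy as np

# Assumed illustrative parameters of a two-component 1-D Gaussian mixture
pi = np.array([0.3, 0.7])        # mixing coefficients p(z = k)
mu = np.array([-2.0, 3.0])       # component means
sigma = np.array([1.0, 0.5])     # component standard deviations

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Joint of observed and latent variables: p(x, z=k) = p(z=k) p(x | z=k)
def joint(x, k):
    return pi[k] * gaussian_pdf(x, mu[k], sigma[k])

# Marginal of the observed variable, obtained by summing out z:
# p(x) = sum_k p(x, z=k)
def marginal(x):
    return sum(joint(x, k) for k in range(len(pi)))

print(marginal(0.0))
```

Each component density is simple and known in form; only the marginal, which mixes them, is complicated.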
Alternative View of EM
• This view recognizes the key role of latent variables.
• Observed data: the matrix X whose n-th row is x_n^T = [x_n1 x_n2 ... x_nD].
• Latent variables: the matrix Z whose corresponding n-th row is z_n^T = [z_n1 z_n2 ... z_nK].

  X = [x_1^T; x_2^T; ...; x_N^T],   Z = [z_1^T; z_2^T; ...; z_N^T]

• The goal of the EM algorithm is to find maximum likelihood solutions for models having latent variables.
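As a small sketch of these data matrices (sizes and values below are assumptions for illustration): in a mixture model with K components, each latent row z_n is commonly taken as a 1-of-K (one-hot) binary vector indicating which component generated observation x_n.

```python
import numpy as np

# Assumed illustrative sizes: N observations, D observed dims, K components
N, D, K = 5, 2, 3
rng = np.random.default_rng(1)

X = rng.normal(size=(N, D))          # observed data matrix, n-th row is x_n^T
labels = rng.integers(0, K, size=N)  # which component each x_n came from

# Latent matrix Z: n-th row is z_n^T = [z_n1 ... z_nK], one-hot encoded
Z = np.zeros((N, K))
Z[np.arange(N), labels] = 1

print(Z)
```

Each row of Z has exactly one nonzero entry, so summing a row gives 1 and summing a column counts how many points that component generated.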

Likelihood Function Involving Latent Variables
• The joint likelihood function is p(X, Z | θ), where θ is the set of all model parameters
  – e.g., means, covariances, mixing coefficients.
• The marginal likelihood function of the observed data follows from the sum rule:

  p(X | θ) = Σ_Z p(X, Z | θ)

• The log-likelihood function is

  ln p(X | θ) = ln { Σ_Z p(X, Z | θ) }
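For i.i.d. data from a mixture model, the sum over Z reduces to a per-point sum over the K component labels, so ln p(X|θ) = Σ_n ln( Σ_k π_k N(x_n | μ_k, σ_k) ). A sketch under assumed parameter values (computed stably in log space, since the sum over k sits inside the log):

```python
import numpy as np

# Assumed illustrative parameters and data for a 1-D Gaussian mixture
pi = np.array([0.4, 0.6])
mu = np.array([0.0, 4.0])
sigma = np.array([1.0, 2.0])
X = np.array([-0.5, 0.3, 3.9, 5.2])

def log_gauss(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma)**2

# log p(x_n, z_n=k | theta) for every (n, k) pair, shape (N, K)
log_joint = np.log(pi)[None, :] + log_gauss(X[:, None], mu[None, :], sigma[None, :])

# log of the sum over k, via the log-sum-exp trick for numerical stability
m = log_joint.max(axis=1, keepdims=True)
log_marginal = (m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True))).ravel()

# Incomplete-data log-likelihood: sum over data points of a log of a sum
log_likelihood = log_marginal.sum()
print(log_likelihood)
```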
Latent Variables in EM
• The log-likelihood function is

  ln p(X | θ) = ln { Σ_Z p(X, Z | θ) }

• Key observation: the summation over the latent variables appears inside the logarithm.
  – The summation inside the braces comes from marginalization, not from forming the log-likelihood.
• Even if the joint distribution p(X, Z | θ) belongs to the exponential family, the marginal distribution p(X | θ) in general does not.
• This results in complicated expressions for the maximum likelihood solution.
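A tiny numeric check of the key observation (the values are arbitrary stand-ins): the log does not pass through the sum over latent values, so ln Σ_k p(x, z=k) cannot be split into the per-component sum of logs that would make maximization easy.

```python
import numpy as np

# Arbitrary stand-ins for the joint values p(x, z=k | theta), k = 0, 1
a = np.array([0.2, 0.5])

inside = np.log(a.sum())   # ln( sum_k p(x, z=k) )  -- what marginalization gives us
outside = np.log(a).sum()  # sum_k ln p(x, z=k)     -- what would decompose nicely

print(inside, outside)
```

The two quantities differ, which is why the maximum likelihood equations for the marginal have no simple closed form.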

Complete and Incomplete Data Sets
• Complete data {X, Z}:
  – For each observation in X we know the corresponding value of the latent variable in Z.
  – The log-likelihood has the form ln p(X, Z | θ), whose maximization is straightforward.
• Incomplete data {X}:
  – This is the actual data set.
  – The log-likelihood function is

    ln p(X | θ) = ln { Σ_Z p(X, Z | θ) }

  – Maximization is difficult because of the summation inside the logarithm.
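The complete-data case can be sketched concretely (parameter values and labels below are assumed for illustration): when the component label z_n of every x_n is known, ln p(X, Z | θ) = Σ_n ln( π_{z_n} N(x_n | μ_{z_n}, σ_{z_n}) ) and the log acts directly on each factor, with no sum inside it.

```python
import numpy as np

# Assumed illustrative parameters, data, and observed latent labels
pi = np.array([0.4, 0.6])
mu = np.array([0.0, 4.0])
sigma = np.array([1.0, 2.0])
X = np.array([-0.5, 0.3, 3.9, 5.2])
z = np.array([0, 0, 1, 1])   # known component label for each x_n

def log_gauss(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma)**2

# Complete-data log-likelihood: a plain sum of logs, one term per point,
# so it is straightforward to maximize in closed form
complete_ll = (np.log(pi[z]) + log_gauss(X, mu[z], sigma[z])).sum()
print(complete_ll)
```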
Expectation of the Log-Likelihood
• Since we do not have the complete data set {X, Z}, we instead evaluate the expectation of the complete-data log-likelihood ln p(X, Z | θ).
• Since we are given X, our state of knowledge of Z is given only by the posterior distribution of the latent variables, p(Z | X, θ).
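This step can be sketched for a Gaussian mixture under assumed parameters: compute the posterior p(z_n = k | x_n, θ) for each point (often called the responsibilities), then take the expectation of ln p(X, Z | θ) under that posterior.

```python
import numpy as np

# Assumed illustrative parameters and data
pi = np.array([0.4, 0.6])
mu = np.array([0.0, 4.0])
sigma = np.array([1.0, 2.0])
X = np.array([-0.5, 0.3, 3.9, 5.2])

def log_gauss(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma)**2

# log p(x_n, z_n=k | theta) for every (n, k) pair
log_joint = np.log(pi)[None, :] + log_gauss(X[:, None], mu[None, :], sigma[None, :])

# Posterior over the latent labels: gamma[n, k] = p(z_n = k | x_n, theta),
# obtained by normalizing the joint over k (Bayes' theorem)
joint = np.exp(log_joint)
gamma = joint / joint.sum(axis=1, keepdims=True)

# Expectation of the complete-data log-likelihood under the posterior:
# E_Z[ ln p(X, Z | theta) ] = sum_n sum_k gamma[n, k] * log_joint[n, k]
Q = (gamma * log_joint).sum()
print(gamma.round(3))
print(Q)
```

Maximizing this expectation with respect to θ, with gamma held fixed, is the M step of EM.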
