ISyE8843A, Brani Vidakovic

Handout 12

1 EM Algorithm and Mixtures

1.1 Introduction

The Expectation-Maximization (EM) iterative algorithm is a broadly applicable statistical technique for maximizing complex likelihoods and handling the incomplete-data problem. At each iteration of the algorithm, two steps are performed: (i) the E-step, which projects an appropriate functional containing the augmented data onto the space of the original, incomplete data, and (ii) the M-step, which maximizes that functional. The name "EM algorithm" was coined by Dempster, Laird, and Rubin in their fundamental paper [1], often referred to as the DLR paper. But if one comes up with a smart idea, one may be sure that other smart people in history have thought about it. The EM algorithm relates to MCMC as a forerunner through its data-augmentation step, which replaces simulation by maximization. Newcomb [7] was interested in estimating mixtures of normals in 1886. McKendrick [5] and Healy and Westmacott [3] proposed iterative methods that are, in fact, examples of the EM algorithm. Dozens of papers proposing various applications of EM appeared before the DLR paper in 1977. However, the DLR paper was the first to unify and organize the approach.

1.2 What is EM?

Let $Y$ be a random vector corresponding to the observed data $y$, with postulated pdf $f(y, \theta)$, where $\theta = (\theta_1, \dots, \theta_d)$ is a vector of unknown parameters. Let $x$ be the vector of augmented (so-called complete) data, and let $z$ be the additional data, so that $x = [y, z]$. Denote by $g_c(x, \theta)$ the pdf of the random vector corresponding to the complete data set $x$. The log-likelihood for $\theta$, if $x$ were fully observed, would be

$$\log L_c(\theta) = \log g_c(x, \theta).$$

The incomplete data vector $y$ comes from the incomplete sample space $\mathcal{Y}$. There is a 1-1 correspondence between the complete sample space $\mathcal{X}$ and the incomplete sample space $\mathcal{Y}$. Thus, for $x \in \mathcal{X}$, one can uniquely find the incomplete $y = y(x) \in \mathcal{Y}$.
Also, the incomplete-data pdf can be found by properly integrating out the complete-data pdf,

$$g(y, \theta) = \int_{\mathcal{X}(y)} g_c(x, \theta)\, dx,$$

where $\mathcal{X}(y)$ is the subset of $\mathcal{X}$ constrained by the relation $y = y(x)$.

Let $\theta^{(0)}$ be some initial value for $\theta$. At the $k$-th step, the EM algorithm performs the following two steps:

E-Step. Calculate

$$Q(\theta, \theta^{(k)}) = E_{\theta^{(k)}}\{\log L_c(\theta) \mid y\}.$$

M-Step. Choose any value $\theta^{(k+1)}$ that maximizes $Q(\theta, \theta^{(k)})$, i.e.,

$$(\forall \theta)\quad Q(\theta^{(k+1)}, \theta^{(k)}) \geq Q(\theta, \theta^{(k)}).$$

The E and M steps are alternated until the difference $L(\theta^{(k+1)}) - L(\theta^{(k)})$ becomes small in absolute value. Next we illustrate the EM algorithm on a famous example first considered by Fisher and Balmukand [2].
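Before turning to that example, the E/M loop above can be sketched numerically for the two-component normal mixture mentioned in the introduction (the problem that interested Newcomb). This is not the handout's worked example: the simulated data, starting values $\theta^{(0)}$, and stopping tolerance are all illustrative assumptions. The component labels $z$ play the role of the missing data, and for a normal mixture the M-step maximizers of $Q(\theta, \theta^{(k)})$ are available in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated observed data y from a two-component normal mixture;
# the hidden component labels z are the missing data, x = [y, z].
n = 500
z = rng.random(n) < 0.3                      # true (unobserved) labels
y = np.where(z, rng.normal(-2.0, 1.0, n), rng.normal(2.0, 1.0, n))

def npdf(y, mu, sigma):
    """Normal density evaluated pointwise."""
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# theta^(0): arbitrary starting values for (weight, means, std devs)
pi_, mu1, mu2, s1, s2 = 0.5, -1.0, 1.0, 1.0, 1.0

old_ll = -np.inf
for k in range(500):
    # E-step: E_{theta^(k)}[z_i | y_i], the posterior probability that
    # observation i came from component 1, under the current estimate.
    w1 = pi_ * npdf(y, mu1, s1)
    w2 = (1 - pi_) * npdf(y, mu2, s2)
    r = w1 / (w1 + w2)

    # M-step: closed-form maximizers of Q(theta, theta^(k)) --
    # responsibility-weighted proportions, means, and variances.
    pi_ = r.mean()
    mu1 = (r * y).sum() / r.sum()
    mu2 = ((1 - r) * y).sum() / (1 - r).sum()
    s1 = np.sqrt((r * (y - mu1) ** 2).sum() / r.sum())
    s2 = np.sqrt(((1 - r) * (y - mu2) ** 2).sum() / (1 - r).sum())

    # Alternate E and M until the observed-data log-likelihood L(theta)
    # changes by less than a small tolerance, as described above.
    ll = np.log(w1 + w2).sum()
    if abs(ll - old_ll) < 1e-8:
        break
    old_ll = ll
```

With well-separated components such as these, the iterates settle near the generating values; with overlapping components or poor starting values, EM can converge slowly or to a local maximum of the likelihood, which is why the stopping rule monitors the change in $L(\theta)$ rather than a fixed iteration count.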

This note was uploaded on 10/23/2011 for the course ISYE 8843 taught by Professor Vidakovic during the Spring '11 term at Georgia Institute of Technology.
