ISyE8843A, Brani Vidakovic

Handout 12

1 EM Algorithm and Mixtures

1.1 Introduction

The Expectation-Maximization (EM) algorithm is an iterative, broadly applicable statistical technique for maximizing complex likelihoods and handling incomplete-data problems. Each iteration performs two steps: (i) the E-step, which projects an appropriate functional of the augmented data onto the space of the original, incomplete data, and (ii) the M-step, which maximizes that functional. The name "EM algorithm" was coined by Dempster, Laird, and Rubin in their fundamental paper [1], often referred to as the DLR paper. As with most good ideas, however, precursors can be found throughout history. The EM algorithm is a forerunner of MCMC: its data-augmentation step replaces simulation by maximization. Newcomb [7] was already estimating mixtures of normals in 1886, and McKendrick [5] and Healy and Westmacott [3] proposed iterative methods that are, in fact, instances of the EM algorithm. Dozens of papers applying EM appeared before the DLR paper in 1977, but the DLR paper was the first to unify and organize the approach.

1.2 What is EM?

Let Y be a random vector corresponding to the observed data y, with postulated pdf f(y, ψ), where ψ = (ψ_1, ..., ψ_d) is a vector of unknown parameters. Let x be the vector of augmented (so-called complete) data, and let z be the additional data, so that x = [y, z]. Denote by g_c(x, ψ) the pdf of the random vector corresponding to the complete data set x. The log-likelihood for ψ, if x were fully observed, would be

    log L_c(ψ) = log g_c(x, ψ).

The incomplete data vector y comes from the "incomplete" sample space Y. There is a 1-1 correspondence between the complete sample space X and the incomplete sample space Y; thus, for x ∈ X, one can uniquely find the "incomplete" y = y(x) ∈ Y.
Also, the incomplete-data pdf can be found by properly integrating out the complete-data pdf,

    g(y, ψ) = ∫_{X(y)} g_c(x, ψ) dx,

where X(y) is the subset of X constrained by the relation y = y(x).

Let ψ^(0) be some initial value for ψ. At the k-th step, the EM algorithm performs the following two steps:

E-Step. Calculate

    Q(ψ, ψ^(k)) = E_{ψ^(k)} { log L_c(ψ) | y }.

M-Step. Choose any value ψ^(k+1) that maximizes Q(ψ, ψ^(k)), i.e., for all ψ,

    Q(ψ^(k+1), ψ^(k)) ≥ Q(ψ, ψ^(k)).
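The two steps above can be sketched in code for a two-component normal mixture, the model the handout's title points to. This is a minimal sketch under assumed unit component variances, with ψ = (p, μ0, μ1); for this model both steps have closed forms: the E-step reduces to posterior membership probabilities E[z_i | y_i, ψ^(k)], and the M-step to weighted averages.

```python
import numpy as np

def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at y."""
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def em_mixture(y, psi0, n_iter=100):
    """EM for a two-component normal mixture with unit variances.
    psi = (p, mu0, mu1): mixing weight and the two component means."""
    p, mu0, mu1 = psi0
    for _ in range(n_iter):
        # E-step: tau_i = E[z_i | y_i, psi^(k)], the posterior probability
        # that observation i came from component 1.  These expectations are
        # all that Q(psi, psi^(k)) depends on for this model.
        w1 = p * normal_pdf(y, mu1, 1.0)
        w0 = (1 - p) * normal_pdf(y, mu0, 1.0)
        tau = w1 / (w0 + w1)
        # M-step: maximize Q in closed form (weighted proportions and means).
        p = tau.mean()
        mu1 = (tau * y).sum() / tau.sum()
        mu0 = ((1 - tau) * y).sum() / (1 - tau).sum()
    return p, mu0, mu1

# Simulated data with true psi = (0.3, 0.0, 3.0); starting values are rough.
rng = np.random.default_rng(1)
z = rng.binomial(1, 0.3, 1000)
y = rng.normal(np.where(z == 1, 3.0, 0.0), 1.0)
p_hat, mu0_hat, mu1_hat = em_mixture(y, (0.5, -1.0, 1.0))
```

Note that the M-step here is exact, so each iteration can only increase the observed-data likelihood, which is what justifies the stopping rule on L(ψ^(k+1)) − L(ψ^(k)) described next.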

The E and M steps are alternated until the difference L(ψ^(k+1)) − L(ψ^(k)) becomes small in absolute value.

Next we illustrate the EM algorithm on a famous example first considered by Fisher and Balmukand [2]. It is also discussed in Rao's monograph [8] and in McLachlan and Krishnan [6].

1.2.1 Fisher's Example

Here is the background; this description follows a superb 2002 lecture by Terry Speed of UC Berkeley. In modern terminology, one has two linked bi-allelic loci, A and B say, with alleles A and a, and B and b, respectively, where A is dominant over a and B is dominant over b. A double heterozygote AaBb will produce gametes of four types: AB, Ab, aB, and ab. Since the loci are linked, the types AB and ab will
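Anticipating where this example leads, EM for linkage data of this type can be sketched. The counts and cell probabilities below are the commonly cited version of the Fisher-Balmukand data as given in Rao [8], not necessarily the handout's own numbers: observed multinomial counts y = (125, 18, 20, 34) with cell probabilities (1/2 + θ/4, (1−θ)/4, (1−θ)/4, θ/4). The missing data z is the θ/4-portion of the first cell, and both EM steps are closed-form.

```python
# EM for the classical genetic-linkage data.  NOTE: these counts and cell
# probabilities are the commonly cited version of the Fisher-Balmukand
# example (Rao [8]); they are an assumption here, since the handout's own
# numbers are not shown above.
y = (125, 18, 20, 34)   # observed multinomial counts
theta = 0.5             # psi^(0), an arbitrary starting value

for _ in range(20):
    # E-step: expected split of the first cell, whose probability
    # 1/2 + theta/4 mixes a "non-linkage" 1/2 part and a theta/4 part.
    e_z = y[0] * (theta / 4) / (0.5 + theta / 4)
    # M-step: closed-form binomial-type maximizer of Q -- the fraction of
    # "theta cells" among the counts that carry information about theta.
    theta = (e_z + y[3]) / (e_z + y[1] + y[2] + y[3])

print(round(theta, 4))
```

With these data the iteration converges rapidly; the fixed point agrees with the root of the score equation 125/(2+θ) − 38/(1−θ) + 34/θ = 0, i.e. θ ≈ 0.6268.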
