Gaussian Mixture Model and EM

Changshui Zhang
Dept. of Automation, Tsinghua University
zcs@mail.tsinghua.edu.cn

Gaussian Mixture Model and EM (Expectation Maximization) Algorithm

Reference
- Electronic edition of the new textbook
- Jeff A. Bilmes, A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models
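GMM (Gaussian Mixture Model)
- Observed data are samples independently generated from the following probabilistic model:
$$p(x \mid \Theta) = \sum_{i=1}^{M} \alpha_i \, p_i(x \mid \theta_i)$$
where
$$\Theta = (\alpha_1, \dots, \alpha_M, \theta_1, \dots, \theta_M), \qquad \sum_{i=1}^{M} \alpha_i = 1,$$
and each component is Gaussian:
$$p(x \mid \omega_j, \theta_j) \sim N(\mu_j, \Sigma_j)$$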
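As a rough illustration of the mixture density above, the following sketch evaluates it in the one-dimensional case, where each covariance reduces to a standard deviation. The function names and the parameter values are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of the 1-D normal N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, alphas, mus, sigmas):
    """p(x | Theta) = sum_i alpha_i * N(x | mu_i, sigma_i^2)."""
    if abs(sum(alphas) - 1.0) > 1e-9:
        raise ValueError("mixing weights must sum to 1")
    return sum(a * gaussian_pdf(x, m, s) for a, m, s in zip(alphas, mus, sigmas))

# Hypothetical two-component mixture (values are illustrative only).
alphas, mus, sigmas = [0.3, 0.7], [-2.0, 1.0], [0.5, 1.5]
density_at_zero = mixture_pdf(0.0, alphas, mus, sigmas)
```

Because the weights sum to 1 and each component integrates to 1, the mixture is itself a valid density.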

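GMM (Gaussian Mixture Model)
- Likelihood function and log-likelihood function, for $c$ components with mixing weights $P(\omega_j)$:
$$p(X \mid \theta) = \prod_{i=1}^{N} \sum_{j=1}^{c} N(x_i \mid \mu_j, \Sigma_j) \, P(\omega_j)$$
$$\ln p(X \mid \theta) = \sum_{i=1}^{N} \ln \sum_{j=1}^{c} N(x_i \mid \mu_j, \Sigma_j) \, P(\omega_j)$$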
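The log-likelihood can be evaluated directly from its definition, but the inner sum underflows when a point lies far from every component. The sketch below is one possible way to compute it in the 1-D case using the log-sum-exp trick. That trick is a numerical device added here, not something the slides prescribe, and the function names are hypothetical.

```python
import math

def log_gaussian(x, mu, sigma):
    """ln N(x | mu, sigma^2) for a 1-D Gaussian."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2

def gmm_log_likelihood(data, priors, mus, sigmas):
    """ln p(X | theta) = sum_i ln sum_j N(x_i | mu_j, sigma_j^2) P(omega_j).

    Assumes every prior is strictly positive (math.log(0) would raise).
    """
    total = 0.0
    for x in data:
        terms = [math.log(p) + log_gaussian(x, m, s)
                 for p, m, s in zip(priors, mus, sigmas)]
        top = max(terms)  # factor out the largest term before exponentiating
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total
```

Subtracting the largest term keeps at least one exponent equal to zero, so the sum inside the logarithm is never smaller than 1.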
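GMM (Gaussian Mixture Model)
Starting from the log-likelihood
$$\ln p(X \mid \theta) = \sum_{i=1}^{N} \ln \sum_{j=1}^{c} N(x_i \mid \mu_j, \Sigma_j) \, P(\omega_j)$$
and setting its derivative with respect to $\mu_k$ to zero,
$$\frac{\partial \ln p(X \mid \theta)}{\partial \mu_k} = 0,$$
gives
$$\sum_{i=1}^{N} \frac{N(x_i \mid \mu_k, \Sigma_k) \, P(\omega_k)}{\sum_{j=1}^{c} N(x_i \mid \mu_j, \Sigma_j) \, P(\omega_j)} \, \Sigma_k^{-1} (x_i - \mu_k) = 0$$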

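GMM (Gaussian Mixture Model)
The stationarity condition
$$\sum_{i=1}^{N} \frac{N(x_i \mid \mu_k, \Sigma_k) \, P(\omega_k)}{\sum_{j=1}^{c} N(x_i \mid \mu_j, \Sigma_j) \, P(\omega_j)} \, \Sigma_k^{-1} (x_i - \mu_k) = 0$$
contains the posterior probability of component $\omega_k$ given $x_i$:
$$P(\omega_k \mid x_i, \mu_k, \Sigma_k) = \frac{N(x_i \mid \mu_k, \Sigma_k) \, P(\omega_k)}{\sum_{j=1}^{c} N(x_i \mid \mu_j, \Sigma_j) \, P(\omega_j)}$$
Solving for $\mu_k$:
$$\mu_k = \frac{\sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k) \, x_i}{\sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k)}$$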
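The posterior and the resulting weighted mean can be sketched as follows for the 1-D case. The names `posteriors` and `weighted_mean` are hypothetical, and the code assumes that every point has non-negligible density under at least one component.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of the 1-D normal N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posteriors(x, priors, mus, sigmas):
    """Return [P(omega_k | x)] for all components k, via Bayes' rule."""
    joint = [p * gaussian_pdf(x, m, s) for p, m, s in zip(priors, mus, sigmas)]
    evidence = sum(joint)  # assumed > 0; underflows to 0 only for extreme outliers
    return [j / evidence for j in joint]

def weighted_mean(data, priors, mus, sigmas, k):
    """mu_k = sum_i P(omega_k | x_i) x_i / sum_i P(omega_k | x_i)."""
    resp = [posteriors(x, priors, mus, sigmas)[k] for x in data]
    return sum(r * x for r, x in zip(resp, data)) / sum(resp)
```

The right-hand side still depends on the current $\mu_k$ through the posteriors, which is why the formula is applied iteratively rather than solved in one shot.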
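GMM (Gaussian Mixture Model)
$$\mu_k = \frac{\sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k) \, x_i}{\sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k)}$$
$$P(\omega_k) = \frac{1}{N} \sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k)$$
$$\Sigma_k = \frac{\sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k) \, (x_i - \mu_k)(x_i - \mu_k)^T}{\sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k)}$$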

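Parameter estimation for GMM
- Initialize: $\mu_k$, $\Sigma_k$, $P(\omega_k)$
- E step: compute $P(\omega_k \mid x_i, \mu_k, \Sigma_k)$
- M step:
$$P^{new}(\omega_k) = \frac{1}{N} \sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k)$$
$$\mu_k^{new} = \frac{\sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k) \, x_i}{\sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k)}$$
$$\Sigma_k^{new} = \frac{\sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k) \, (x_i - \mu_k^{new})(x_i - \mu_k^{new})^T}{\sum_{i=1}^{N} P(\omega_k \mid x_i, \mu_k, \Sigma_k)}$$
- Update: $\mu_k = \mu_k^{new}$, $\Sigma_k = \Sigma_k^{new}$, $P(\omega_k) = P^{new}(\omega_k)$, then repeat the E and M steps.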
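The initialize / E step / M step loop can be sketched for one-dimensional data as below. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names, the toy data, the fixed iteration count, and the `MIN_SIGMA` floor are all hypothetical additions, and the variance update uses the freshly computed mean.

```python
import math

MIN_SIGMA = 1e-6  # assumed floor (not in the slides) to avoid a collapsing component

def gaussian_pdf(x, mu, sigma):
    """Density of the 1-D normal N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(data, priors, mus, sigmas):
    """ln p(X | theta), computed directly from the definition."""
    return sum(
        math.log(sum(p * gaussian_pdf(x, m, s) for p, m, s in zip(priors, mus, sigmas)))
        for x in data
    )

def em_step(data, priors, mus, sigmas):
    """One E step followed by one M step; returns the updated parameters."""
    n, c = len(data), len(priors)
    # E step: posterior P(omega_k | x_i) under the current parameters.
    resp = []
    for x in data:
        joint = [priors[k] * gaussian_pdf(x, mus[k], sigmas[k]) for k in range(c)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M step: re-estimate priors, means and standard deviations.
    new_priors, new_mus, new_sigmas = [], [], []
    for k in range(c):
        weight = sum(r[k] for r in resp)  # assumed > 0 for every component
        mu = sum(r[k] * x for r, x in zip(resp, data)) / weight
        var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, data)) / weight
        new_priors.append(weight / n)
        new_mus.append(mu)
        new_sigmas.append(max(math.sqrt(var), MIN_SIGMA))
    return new_priors, new_mus, new_sigmas

def fit_gmm(data, priors, mus, sigmas, iterations=50):
    """Run a fixed number of EM iterations and record the log-likelihood."""
    history = [log_likelihood(data, priors, mus, sigmas)]
    for _ in range(iterations):
        priors, mus, sigmas = em_step(data, priors, mus, sigmas)
        history.append(log_likelihood(data, priors, mus, sigmas))
    return priors, mus, sigmas, history

# Toy data with two well-separated groups (illustrative values only).
data = [-2.1, -1.9, -2.0, -2.2, -1.8, 3.0, 3.1, 2.9, 3.2, 2.8]
priors, mus, sigmas, history = fit_gmm(data, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

Recording the log-likelihood after each iteration gives a simple sanity check, since EM should never decrease it. In practice a convergence threshold on that value would usually replace the fixed iteration count.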
Maximum Likelihood
- Problem:
-- The set of parameters: $\Theta$
-- A data set: $X = \{x_1, x_2, \dots, x_N\}$
-- A density function: $p(x \mid \Theta)$
-- Likelihood function, where the data is fixed:
$$L(\Theta \mid X) := p(X \mid \Theta) = \prod_{i=1}^{N} p(x_i \mid \Theta)$$
-- Task: find $\Theta^*$ such that
$$\Theta^* = \arg\max_{\Theta} L(\Theta \mid X)$$
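For a single 1-D Gaussian the maximization has a well-known closed form: the sample mean and the (biased) sample standard deviation. The sketch below uses that case to make the task concrete. The data values and function names are hypothetical.

```python
import math

def gaussian_log_likelihood(data, mu, sigma):
    """ln L(Theta | X) for Theta = (mu, sigma) under a 1-D Gaussian density."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2
        for x in data
    )

def gaussian_mle(data):
    """Closed-form maximizer: sample mean and biased (1/N) standard deviation."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return mu, math.sqrt(var)

# Illustrative data set; the values are made up.
data = [1.2, 0.7, 2.3, 1.9, 1.4]
mu_hat, sigma_hat = gaussian_mle(data)
best = gaussian_log_likelihood(data, mu_hat, sigma_hat)
```

Moving either parameter away from the estimate can only lower the log-likelihood. For a mixture no such closed form exists, which is what motivates an iterative procedure.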

Incomplete data
- Missing values
-- Empty items in forms
-- Forgotten data
- Hidden variables
-- Cannot be measured or observed.
Basic EM
- An elaborate technique for finding the maximum likelihood estimate of the parameters of a distribution from a given data set when the data is incomplete or has missing values.
- Reference: A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. J. Royal Statist. Soc. Ser. B, 39, 1977.

- Two main applications of the EM algorithm:
-- The data indeed has missing values, due to problems with or limitations of the observation process.
-- Optimizing the likelihood function is analytically intractable, but the likelihood function can be simplified by assuming the existence of, and values for, additional but missing (or hidden) parameters.
- We assume that the data $X = \{x_1, x_2, \dots, x_N\}$ is observed and is incomplete.
- A complete data set, where $Y$ denotes the missing (hidden) data: $Z = (X, Y)$
- And a joint density function:
$$p(Z \mid \Theta) = p(X, Y \mid \Theta) = p(Y \mid X, \Theta) \, p(X \mid \Theta)$$

- New likelihood function: the complete-data likelihood
$$L(\Theta \mid Z) = L(\Theta \mid X, Y) = p(X, Y \mid \Theta)$$
- It is a random variable, since $X$ and $\Theta$ are constant and $Y$ is a random variable.
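To make the dependence on the hidden data concrete, the sketch below evaluates the complete-data log-likelihood of a 1-D Gaussian mixture. It assumes that the hidden data are the component labels, one per sample, which is the usual choice for mixtures but is an assumption here. All names and values are hypothetical.

```python
import math

def log_gaussian(x, mu, sigma):
    """ln N(x | mu, sigma^2) for a 1-D Gaussian."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2

def complete_data_log_likelihood(data, labels, alphas, mus, sigmas):
    """ln p(X, Y | Theta) = sum_i ln( alpha_{y_i} * N(x_i | mu_{y_i}, sigma_{y_i}^2) ).

    labels[i] is the (assumed) hidden component index y_i of sample x_i.
    """
    return sum(
        math.log(alphas[y]) + log_gaussian(x, mus[y], sigmas[y])
        for x, y in zip(data, labels)
    )

# Same X and Theta, two different hidden assignments Y (illustrative values).
data = [-2.0, 3.0]
alphas, mus, sigmas = [0.5, 0.5], [-2.0, 3.0], [1.0, 1.0]
matched = complete_data_log_likelihood(data, [0, 1], alphas, mus, sigmas)
swapped = complete_data_log_likelihood(data, [1, 0], alphas, mus, sigmas)
```

With the data and parameters held fixed, the value changes only with the labels, so it is a function of the random hidden data.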

