A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models

Jeff A. Bilmes (bilmes@cs.berkeley.edu)
International Computer Science Institute, Berkeley CA, 94704
and
Computer Science Division, Department of Electrical Engineering and Computer Science, U.C. Berkeley

TR-97-021
April 1998

Abstract

We describe the maximum-likelihood parameter estimation problem and how the Expectation-Maximization (EM) algorithm can be used for its solution. We first describe the abstract form of the EM algorithm as it is often given in the literature. We then develop the EM parameter estimation procedure for two applications: 1) finding the parameters of a mixture of Gaussian densities, and 2) finding the parameters of a hidden Markov model (HMM) (i.e., the Baum-Welch algorithm) for both discrete and Gaussian mixture observation models. We derive the update equations in fairly explicit detail but we do not prove any convergence properties. We try to emphasize intuition rather than mathematical rigor.
1 Maximum-likelihood

Recall the definition of the maximum-likelihood estimation problem. We have a density function $p(\mathbf{x}|\Theta)$ that is governed by the set of parameters $\Theta$ (e.g., $p$ might be a set of Gaussians and $\Theta$ could be the means and covariances). We also have a data set of size $N$, supposedly drawn from this distribution, i.e., $\mathcal{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$. That is, we assume that these data vectors are independent and identically distributed (i.i.d.) with distribution $p$. Therefore, the resulting density for the samples is

$$p(\mathcal{X}|\Theta) = \prod_{i=1}^{N} p(\mathbf{x}_i|\Theta) = \mathcal{L}(\Theta|\mathcal{X}).$$

This function $\mathcal{L}(\Theta|\mathcal{X})$ is called the likelihood of the parameters given the data, or just the likelihood function. The likelihood is thought of as a function of the parameters $\Theta$ where the data $\mathcal{X}$ is fixed. In the maximum likelihood problem, our goal is to find the $\Theta$ that maximizes $\mathcal{L}$. That is, we wish to find $\Theta^*$ where

$$\Theta^* = \operatorname*{argmax}_{\Theta} \mathcal{L}(\Theta|\mathcal{X}).$$

Often we maximize $\log \mathcal{L}(\Theta|\mathcal{X})$ instead because it is analytically easier.

Depending on the form of $p(\mathbf{x}|\Theta)$ this problem can be easy or hard. For example, if $p(\mathbf{x}|\Theta)$ is simply a single Gaussian distribution where $\Theta = (\mu, \sigma^2)$, then we can set the derivative of $\log \mathcal{L}(\Theta|\mathcal{X})$ to zero and solve directly for $\mu$ and $\sigma^2$ (this, in fact, results in the standard formulas for the mean and variance of a data set; the first sketch below illustrates this closed-form case). For many problems, however, it is not possible to find such analytical expressions, and we must resort to more elaborate techniques.

2 Basic EM

The EM algorithm is one such elaborate technique. The EM algorithm [ALR77, RW84, GJ95, JJ94, Bis95, Wu83] is a general method of finding the maximum-likelihood estimate of the parameters of an underlying distribution from a given data set when the data is incomplete or has missing values.

There are two main applications of the EM algorithm. The first occurs when the data indeed has missing values, due to problems with or limitations of the observation process. The second occurs when optimizing the likelihood function is analytically intractable, but the likelihood function can be simplified by assuming the existence of, and values for, additional but hidden (or missing) parameters.
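To make the closed-form Gaussian case of Section 1 concrete, here is a minimal NumPy sketch, not from the report itself: the 1-D setting and the name gaussian_mle are illustrative choices. Setting the derivative of the log-likelihood to zero yields the sample mean and the $1/N$ sample variance, as noted above.

```python
import numpy as np

def gaussian_mle(x):
    """Closed-form maximum-likelihood estimates for a 1-D Gaussian.

    Setting the derivative of log L(mu, sigma^2 | X) to zero gives the
    sample mean and the (biased, 1/N) sample variance, exactly the
    "standard formulas" mentioned in Section 1.
    """
    x = np.asarray(x, dtype=float)
    mu = x.mean()                    # d/d(mu) log L = 0   =>  mu = (1/N) sum_i x_i
    sigma2 = ((x - mu) ** 2).mean()  # d/d(sigma^2) log L = 0 => (1/N) sum_i (x_i - mu)^2
    return mu, sigma2

# Example: i.i.d. samples should recover the generating parameters.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)
mu_hat, sigma2_hat = gaussian_mle(x)
print(mu_hat, sigma2_hat)  # close to 2.0 and 1.5**2 = 2.25
```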
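As a preview of the first application named in the abstract (mixture-of-Gaussians parameter estimation), this second sketch runs the standard EM iteration for a two-component 1-D mixture, where the component that generated each sample plays the role of the hidden variable from Section 2. The function names, the initialization scheme, and the fixed iteration count are my own illustrative assumptions; the responsibility and re-estimation formulas are the standard Gaussian-mixture updates of the kind the report goes on to derive.

```python
import numpy as np

def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) evaluated at each point of x."""
    return np.exp(-0.5 * (x - mu) ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)

def em_gmm_1d(x, n_components=2, n_iters=100, seed=0):
    """EM for a 1-D mixture of Gaussians.

    Parameters are the mixture weights alpha_k, means mu_k, and
    variances sigma2_k; the hidden data is which component produced
    each sample, which the E-step averages over.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    # Crude initialization: random data points as means, shared variance.
    alpha = np.full(n_components, 1.0 / n_components)
    mu = rng.choice(x, size=n_components, replace=False)
    sigma2 = np.full(n_components, x.var())

    for _ in range(n_iters):
        # E-step: responsibilities r[i, k] = P(component k | x_i, current params).
        r = alpha * np.stack([normal_pdf(x, mu[k], sigma2[k])
                              for k in range(n_components)], axis=1)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: re-estimate parameters from expected sufficient statistics.
        nk = r.sum(axis=0)                 # expected count per component
        alpha = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma2 = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk

    return alpha, mu, sigma2

# Example: two well-separated components.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(3.0, 0.5, 500)])
print(em_gmm_1d(x))
```

Note how the two steps mirror the two uses of EM described above: the E-step fills in the missing component assignments (in expectation), and the M-step then solves an easy complete-data maximization of the same closed-form kind as the single-Gaussian case.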