# A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models
Jeff A. Bilmes ([email protected])
International Computer Science Institute, Berkeley, CA 94704
and Computer Science Division, Department of Electrical Engineering and Computer Science, U.C. Berkeley
TR-97-021, April 1998

### Abstract

We describe the maximum-likelihood parameter estimation problem and how the Expectation-Maximization (EM) algorithm can be used for its solution. We first describe the abstract form of the EM algorithm as it is often given in the literature. We then develop the EM parameter estimation procedure for two applications: 1) finding the parameters of a mixture of Gaussian densities, and 2) finding the parameters of a hidden Markov model (HMM) (i.e., the Baum-Welch algorithm) for both discrete and Gaussian mixture observation models. We derive the update equations in fairly explicit detail, but we do not prove any convergence properties. We try to emphasize intuition rather than mathematical rigor.

## 1 Maximum-likelihood

Recall the definition of the maximum-likelihood estimation problem. We have a density function $p(\mathbf{x} \mid \Theta)$ that is governed by the set of parameters $\Theta$ (e.g., $p$ might be a set of Gaussians and $\Theta$ could be the means and covariances). We also have a data set of size $N$, supposedly drawn from this distribution, i.e., $\mathcal{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$. That is, we assume that these data vectors are independent and identically distributed (i.i.d.) with distribution $p$. Therefore, the resulting density for the samples is

$$p(\mathcal{X} \mid \Theta) = \prod_{i=1}^{N} p(\mathbf{x}_i \mid \Theta) = \mathcal{L}(\Theta \mid \mathcal{X}).$$

This function $\mathcal{L}(\Theta \mid \mathcal{X})$ is called the likelihood of the parameters given the data, or just the likelihood function. The likelihood is thought of as a function of the parameters $\Theta$ where the data $\mathcal{X}$ is fixed. In the maximum-likelihood problem, our goal is to find the $\Theta$ that maximizes $\mathcal{L}$. That is, we wish to find $\Theta^*$ where

$$\Theta^* = \operatorname{argmax}_{\Theta} \, \mathcal{L}(\Theta \mid \mathcal{X}).$$

Often we maximize $\log \mathcal{L}(\Theta \mid \mathcal{X})$ instead because it is analytically easier.

Depending on the form of $p(\mathbf{x} \mid \Theta)$ this problem can be easy or hard. For example, if $p(\mathbf{x} \mid \Theta)$ is simply a single Gaussian distribution where $\Theta = (\mu, \sigma^2)$, then we can set the derivative of $\log \mathcal{L}(\Theta \mid \mathcal{X})$ to zero and solve directly for $\mu$ and $\sigma^2$ (this, in fact, results in the standard formulas for the mean and variance of a data set). For many problems, however, it is not possible to find such analytical expressions, and we must resort to more elaborate techniques.

## 2 Basic EM

The EM algorithm is one such elaborate technique. The EM algorithm [ALR77, RW84, GJ95, JJ94, Bis95, Wu83] is a general method of finding the maximum-likelihood estimate of the parameters of an underlying distribution from a given data set when the data is incomplete or has missing values.

There are two main applications of the EM algorithm. The first occurs when the data indeed has missing values, due to problems with or limitations of the observation process. The second occurs when optimizing the likelihood function is analytically intractable, but the likelihood function can be simplified by assuming the existence of (and values for) additional but missing (or hidden) parameters. The latter application is more common in the computational pattern recognition community.
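As a concrete instance of the "easy" case from Section 1, the single-Gaussian maximum-likelihood estimates have the closed form obtained by zeroing the derivative of the log-likelihood: the sample mean and the 1/N sample variance. A minimal NumPy sketch (the synthetic data and its true parameters are assumptions for illustration, not from the tutorial):

```python
import numpy as np

# Synthetic i.i.d. data from a Gaussian with assumed true mu=2.0, sigma=1.5.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1000)

# Closed-form ML estimates: setting d/d(mu) and d/d(sigma^2) of the
# log-likelihood to zero yields the sample mean and the biased (1/N,
# not 1/(N-1)) sample variance.
mu_hat = data.mean()
sigma2_hat = ((data - mu_hat) ** 2).mean()

print(mu_hat, sigma2_hat)
```

With 1000 samples the estimates land close to the generating parameters; note that the ML variance divides by $N$, so it is slightly smaller than the familiar unbiased estimator.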
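The hidden-parameter application described above is exactly the Gaussian-mixture setting the abstract promises: the hidden value for each point is which mixture component generated it. The E-step/M-step structure can be sketched as follows for a two-component 1-D mixture; this is a minimal illustration, not the tutorial's derivation, and the synthetic data, initial values, and fixed iteration count are all assumptions:

```python
import numpy as np

# Synthetic data from an assumed two-component mixture: 300 points near -2,
# 700 points near 3, unit variances.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])

# Hypothetical starting values; any reasonable initialization works.
w = np.array([0.5, 0.5])      # mixing weights
mu = np.array([-1.0, 1.0])    # component means
var = np.array([1.0, 1.0])    # component variances

for _ in range(50):
    # E-step: posterior responsibility of each component for each point.
    dens = np.exp(-0.5 * (data[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    resp = w * dens
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and variances from responsibilities.
    nk = resp.sum(axis=0)
    w = nk / len(data)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    var = (resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk

print(w, mu, var)
```

Each iteration provably does not decrease the likelihood (the convergence property the abstract declines to prove), and on well-separated data like this the means settle near the generating values within a few iterations.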
