1 Maximum-likelihood
Recall the definition of the maximum-likelihood estimation problem. We have a density function
$p(\mathbf{x}|\Theta)$ that is governed by the set of parameters $\Theta$ (e.g., $p$ might be a set of
Gaussians and $\Theta$ could be the means and covariances). We also have a data set of size $N$,
supposedly drawn from this distribution, i.e., $\mathcal{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$.
That is, we assume that these data vectors are independent and identically distributed (i.i.d.) with
distribution $p$. Therefore, the resulting density for the samples is
$$p(\mathcal{X}|\Theta) = \prod_{i=1}^{N} p(\mathbf{x}_i|\Theta) = \mathcal{L}(\Theta|\mathcal{X}).$$
This function $\mathcal{L}(\Theta|\mathcal{X})$ is called the likelihood of the parameters given the
data, or just the likelihood function. The likelihood is thought of as a function of the parameters
$\Theta$ where the data $\mathcal{X}$ is fixed.
In the maximum likelihood problem, our goal is to find the $\Theta$ that maximizes $\mathcal{L}$.
That is, we wish to find $\Theta^*$ where
$$\Theta^* = \operatorname*{argmax}_{\Theta} \mathcal{L}(\Theta|\mathcal{X}).$$
Often we maximize $\log(\mathcal{L}(\Theta|\mathcal{X}))$ instead because it is analytically easier.
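To make the "analytically easier" remark concrete, here is a short worked step. It writes $p(\mathbf{x}_i|\Theta)$ for the density of one sample and $\mathcal{L}(\Theta|\mathcal{X})$ for the likelihood of the data set $\mathcal{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$. Under the i.i.d. assumption the logarithm turns the product into a sum, and because the logarithm is strictly increasing, the maximizing parameters are unchanged:

```latex
\log \mathcal{L}(\Theta|\mathcal{X})
  = \log \prod_{i=1}^{N} p(\mathbf{x}_i|\Theta)
  = \sum_{i=1}^{N} \log p(\mathbf{x}_i|\Theta),
\qquad
\operatorname*{argmax}_{\Theta} \log \mathcal{L}(\Theta|\mathcal{X})
  = \operatorname*{argmax}_{\Theta} \mathcal{L}(\Theta|\mathcal{X}).
```

A sum of per-sample terms can be differentiated term by term, which is typically what makes the log form more convenient.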
Depending on the form of $p(\mathbf{x}|\Theta)$ this problem can be easy or hard. For example, if
$p(\mathbf{x}|\Theta)$ is simply a single Gaussian distribution where $\Theta = (\mu, \sigma^2)$, then
we can set the derivative of the log-likelihood $\log(\mathcal{L}(\Theta|\mathcal{X}))$ to zero, and
solve directly for $\mu$ and $\sigma^2$ (this, in fact, results in the standard formulas
for the mean and variance of a data set). For many problems, however, it is not possible to find such
analytical expressions, and we must resort to more elaborate techniques.
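The single-Gaussian case can be sketched in a few lines of Python. This is a minimal illustration for scalar data, not part of the original text: the function names `gaussian_log_likelihood` and `gaussian_mle`, the synthetic data, and the perturbation sizes are all assumptions made for the example. It computes the closed-form estimates (the sample mean and the variance with a $1/N$ normalizer) and checks that nearby parameter values do not give a higher log-likelihood.

```python
import math
import random


def gaussian_log_likelihood(data, mu, sigma2):
    """Log-likelihood of i.i.d. scalar data under a Gaussian N(mu, sigma2)."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - squared_error / (2.0 * sigma2)


def gaussian_mle(data):
    """Closed-form maximizers: sample mean and the (1/N) sample variance."""
    n = len(data)
    mu = sum(data) / n
    sigma2 = sum((x - mu) ** 2 for x in data) / n
    return mu, sigma2


if __name__ == "__main__":
    # Synthetic data; the true mean 2.0 and standard deviation 1.5 are arbitrary.
    random.seed(0)
    data = [random.gauss(2.0, 1.5) for _ in range(1000)]

    mu_hat, sigma2_hat = gaussian_mle(data)
    best = gaussian_log_likelihood(data, mu_hat, sigma2_hat)

    # Perturbing either parameter should never increase the log-likelihood.
    for d_mu, d_var in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.2), (0.0, -0.2)]:
        shifted = gaussian_log_likelihood(data, mu_hat + d_mu, sigma2_hat + d_var)
        assert shifted <= best

    print("mu_hat =", mu_hat, "sigma2_hat =", sigma2_hat)
```

Note that the maximum-likelihood variance divides by $N$, not $N - 1$, so it differs slightly from the unbiased sample variance on small data sets.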
2 Basic EM
The EM algorithm is one such elaborate technique. The EM algorithm [ALR77, RW84, GJ95, JJ94,
Bis95, Wu83] is a general method of finding the maximum-likelihood estimate of the parameters of
an underlying distribution from a given data set when the data is incomplete or has missing values.
There are two main applications of the EM algorithm. The first occurs when the data indeed
has missing values, due to problems with or limitations of the observation process. The second
occurs when optimizing the likelihood function is analytically intractable but when the likelihood
function can be simplified by assuming the existence of and values for additional but missing (or
hidden) parameters. The latter application is more common in the computational pattern recognition