arametric estimate of the same density as shown in Figure 2.3.
41 Paul A. Viola CHAPTER 2. PROBABILITY AND ENTROPY
Figure 2.4: Three views of a Gaussian density with a mean of 0.0 and a variance of 1.0. First,
a sample of 100 points drawn from the density. Each point is represented by a vertical black
line. Second, the density of the true Gaussian. Third, the Parzen density estimate constructed
from the sample. The window functions are Gaussians with variance 0.35.
Intuitively, the Parzen density estimator computes a local, or windowed, average of the
sample. Looking back to Equation 2.41, notice that if $R$ is symmetric about the origin we can
view the window function as being centered on the query point, $x$, rather than at the data
points. Viewed in this light, the density estimate at a query point is a weighted sum over
the sample, where the weighting is determined by the window function. The most common
window functions are unimodal, symmetric about the origin, and fall off quickly to zero. In
effect, the window function defines a region centered on $x$ in which sample points contribute
to the density estimate. Points that fall outside of this window do not contribute. The
density estimate at $x$ is the ratio of the weighted number of sample points within the window
to the total number of sample points, $N_a$. Getting a reliable estimate of this ratio
requires that a reasonable number of points fall into the window around the query point.
The number of points that we expect to fall into this window is a function both of the size
of the sample and the size of the window. As the number of points that fall into a window
decreases, the variance of the Parzen density estimate increases. We will analyze the variance
of the Parzen estimate later in the chapter.
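
The windowed average described above can be sketched in a few lines of Python. This is a
minimal illustration, not the author's implementation: the function names, the fixed random
seed, and the query points are assumptions made for the example, while the sample size (100),
the unit-variance Gaussian source, and the window variance (0.35) follow the Figure 2.4 setup.

```python
import math
import random


def gaussian_window(u, variance):
    """Gaussian window function R(u) with zero mean and the given variance."""
    return math.exp(-0.5 * u * u / variance) / math.sqrt(2.0 * math.pi * variance)


def parzen_estimate(x, sample, variance):
    """Average of the window function, centered on x, over every sample point."""
    return sum(gaussian_window(x - xa, variance) for xa in sample) / len(sample)


rng = random.Random(0)  # fixed seed (an assumption) so the sketch is repeatable
sample = [rng.gauss(0.0, 1.0) for _ in range(100)]  # 100 points drawn from N(0, 1)

for x in (-2.0, 0.0, 2.0):
    true_density = gaussian_window(x, 1.0)  # the N(0, 1) density itself
    estimate = parzen_estimate(x, sample, 0.35)
    print(f"x={x:+.1f}  parzen={estimate:.3f}  true={true_density:.3f}")
```

Because the Gaussian window is symmetric, `gaussian_window(x - xa, ...)` can be read either as
a window centered on the data point `xa` or as one centered on the query point `x`.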
42 2.4. MODELING DENSITIES AI-TR 1548
Figure 2.5: The Parzen density estimates for ten different samples of 100 points drawn from
the same Gaussian density.
The balance of computation required by Parzen window density estimation is qualitatively
very different from that of parametric schemes. Constructing a parametric model involves a
lengthy search through parameter space that takes more time for larger samples. Constructing a
Parzen model is cheap: one need only memorize the sample. Evaluating a parametric model
is usually efficient. Once the parameters are known, the number of operations required is
usually very small and does not grow with the size of the sample. Evaluating $P(x, a)$ is more
expensive, requiring time proportional to the size of the sample. The overall computational
complexity of either technique depends on how it is used.
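
The cost trade-off described above can be made concrete with a small sketch. The class names
and the `window_calls` counter are hypothetical, introduced only for illustration. Note also
that a single Gaussian is a special case whose maximum likelihood fit has a closed form, so
this example understates the search cost of general parametric models; it is meant only to
contrast where the work happens.

```python
import math
import random


def gaussian_pdf(x, mean, variance):
    """Density of a Gaussian with the given mean and variance, evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


class GaussianModel:
    """Parametric model: the work happens in fit(); evaluation cost is constant."""

    def fit(self, sample):
        n = len(sample)
        self.mean = sum(sample) / n
        self.variance = sum((xa - self.mean) ** 2 for xa in sample) / n
        return self

    def evaluate(self, x):
        return gaussian_pdf(x, self.mean, self.variance)


class ParzenModel:
    """Parzen model: fit() only memorizes the sample; evaluation visits every point."""

    def __init__(self, window_variance):
        self.window_variance = window_variance
        self.window_calls = 0  # counts window evaluations, for illustration only

    def fit(self, sample):
        self.sample = list(sample)
        return self

    def evaluate(self, x):
        self.window_calls += len(self.sample)
        total = sum(gaussian_pdf(x, xa, self.window_variance) for xa in self.sample)
        return total / len(self.sample)


rng = random.Random(0)  # fixed seed (an assumption) so the sketch is repeatable
sample = [rng.gauss(0.0, 1.0) for _ in range(100)]

parametric = GaussianModel().fit(sample)
parzen = ParzenModel(window_variance=0.35).fit(sample)

queries = [-1.0, 0.0, 1.0]
for x in queries:
    print(f"x={x:+.1f}  parametric={parametric.evaluate(x):.3f}  parzen={parzen.evaluate(x):.3f}")
print("window evaluations for", len(queries), "queries:", parzen.window_calls)
```

Each Parzen query touches all 100 stored points, whereas each parametric query uses only the
two fitted parameters, whatever the sample size.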
Though the Parzen estimate is a mixture model, it is not the maximum likelihood mixture
model. Unlike the Parzen estimate, the maximum likelihood model is not constrained to place
one Gaussian at each of the sample points. There is, however, an asymptotic proof of Parzen
convergence that relies on the law of large numbers. The Parzen estimate can be written as
a sample mean:
$$P(x', a) = \frac{1}{N_a} \sum_{x_a \in a} R(x' - x_a) = E_a[R(x' - X)].$$

In the limit this equals the true expectation, which in turn is a convolution:

\begin{align}
\lim_{N_a \to \infty} P(x, a) &= E_X[R(x - X)] \tag{2.42} \\
&= \int R(x - x')\, p(x')\, dx' \notag \\
&= (R * p)(x). \tag{2.44}
\end{align}

So $P(x, a)$ converges to $p(x)$ if and only if $p(x) = (R * p)(x)$. There are two distinct
conditions under which equality holds. The first is that $R$ tends toward the delta function
when the sample size approaches infinity. The second occurs when convolution by $R$ does not
change $p(x)$. In theory this could be achieved when $p(x)$ has bounded frequency content and
$R$ is a perfect low pass filter. In practice approximate equality holds whenever $p(x)$ has low
frequency content and $R$ is primarily a low pass filter, for example when $p(x)$ is a smooth
function and $R$ is a Gaussian. Finally, whenever $p(x) = (R * p)(x)$ the Parzen estimate,
$P(x, a)$, is an unbiased estimator of $p(x)$.

Figure 2.6: Three views of a density constructed from a combination of two Gaussians. The
Gaussians have variance 0.3 and means of 2.0 and -2.0 respectively. As before, the sample
contains 100 points. The Parzen estimate is constructed with Gaussians of variance 0.20.
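
The claim that the expected Parzen estimate equals the convolution of $R$ with $p$ can be
checked numerically in one case where the convolution is known in closed form. The sketch
below assumes a unit-variance Gaussian density and a Gaussian window of variance 0.35, so
that the convolution is again a Gaussian whose variance is the sum, 1.35. The helper names,
the seed, and the number of trials are assumptions made for the example.

```python
import math
import random


def gaussian_pdf(x, mean, variance):
    """Density of a Gaussian with the given mean and variance, evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def parzen_estimate(x, sample, window_variance):
    """Parzen estimate at x using Gaussian windows of the given variance."""
    return sum(gaussian_pdf(x, xa, window_variance) for xa in sample) / len(sample)


def mean_parzen_estimate(x, n_points, n_trials, window_variance, rng):
    """Average the Parzen estimate at x over many independent N(0, 1) samples."""
    total = 0.0
    for _ in range(n_trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        total += parzen_estimate(x, sample, window_variance)
    return total / n_trials


rng = random.Random(1)  # fixed seed (an assumption) so the sketch is repeatable
window_variance = 0.35

average = mean_parzen_estimate(0.0, 100, 500, window_variance, rng)
smoothed = gaussian_pdf(0.0, 0.0, 1.0 + window_variance)  # (R * p)(0): variances add
true_value = gaussian_pdf(0.0, 0.0, 1.0)  # p(0)

print(f"average Parzen estimate at 0: {average:.4f}")
print(f"(R * p)(0):                   {smoothed:.4f}")
print(f"p(0):                         {true_value:.4f}")
```

In this setting the average tracks the smoothed density rather than $p$ itself, which
illustrates why the window must either narrow as the sample grows or leave $p$ essentially
unchanged under convolution.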
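There are other conditions under which the Parzen estimate will converge to the correct
density estimate. This proof assumes that the samples are corrupted by measurement noise
of a known density. Instead of $X$, a corrupted random variable, $\tilde{X} = X + \eta$, is
observed. If $\eta$ were known, the probability of $\tilde{X}$ would be

$$p(\tilde{X} = \tilde{x} \mid X = x, \eta) = \delta(\tilde{x} - x - \eta).$$

Without knowledge of $\eta$ we must integrate over all its possible values,

\begin{align}
p(\tilde{X} = \tilde{x} \mid X = x) &= \int_{-\infty}^{\infty} p(\tilde{X} = \tilde{x} \mid X = x, \eta')\, p(\eta')\, d\eta' \notag \\
&= \int_{-\infty}^{\infty} \delta(\tilde{x} - x - \eta')\, p(\eta')\, d\eta' \notag \\
&= p_\eta(\tilde{x} - x). \tag{2.47}
\end{align}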
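
The integral over the noise variable can be checked numerically. The sketch below rests on
two assumptions that are not in the text: the noise density is taken to be a zero-mean
Gaussian with variance 0.25, and the delta function is approximated by a very narrow Gaussian
so that a plain Riemann sum can be used. All function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a Gaussian with the given mean and variance, evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def noise_density(eta):
    """Assumed noise density: zero-mean Gaussian with variance 0.25."""
    return gaussian_pdf(eta, 0.0, 0.25)


def narrow_delta(u, width=0.01):
    """Narrow Gaussian used as a numerical stand-in for the delta function."""
    return gaussian_pdf(u, 0.0, width * width)


def corrupted_density(x_tilde, x, step=0.001, limit=5.0):
    """Riemann sum over eta of delta(x_tilde - x - eta) times the noise density."""
    n_steps = int(round(2.0 * limit / step))
    total = 0.0
    for i in range(n_steps + 1):
        eta = -limit + i * step
        total += narrow_delta(x_tilde - x - eta) * noise_density(eta) * step
    return total


x, x_tilde = 1.0, 1.3
print("integral over the noise variable:", corrupted_density(x_tilde, x))
print("noise density at x_tilde - x:    ", noise_density(x_tilde - x))
```

The two printed values should agree closely, since the delta function picks out the single
noise value that maps `x` to `x_tilde`.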
To compute $p(\tilde{X})$ we must integrat...