The probability of the corrupted random variable can


Paul A. Viola. CHAPTER 2. PROBABILITY AND ENTROPY

...e over X. We show that the integral is approximated by the Parzen estimator,

\begin{align}
p(X = x) &= \int_{-\infty}^{\infty} p(X = x \mid \tilde{X} = \tilde{X}')\, p_{\tilde{X}}(\tilde{X}')\, d\tilde{X}' \tag{2.48} \\
&= \int_{-\infty}^{\infty} p_\eta(x - \tilde{X}')\, p_{\tilde{X}}(\tilde{X}')\, d\tilde{X}' \tag{2.49} \\
&= E_{\tilde{X}}\left[ p_\eta(x - \tilde{X}) \right] \tag{2.50} \\
&\approx E_a\left[ p_\eta(x - \tilde{x}_a) \right] \tag{2.51} \\
&= \frac{1}{N_a} \sum_{\tilde{x}_a \in a} p_\eta(x - \tilde{x}_a) , \tag{2.52}
\end{align}

where $a$ is a sample of $\tilde{X}$. The probability of the uncorrupted random variable $X$ is approximated by the Parzen estimate constructed from the samples of $\tilde{X}$, where the smoothing function is the density function of the noise, $p_\eta$. The probability of the corrupted random variable can be derived from a very similar argument,

$$
p(\tilde{X} = \tilde{x}) \approx \frac{1}{N_a} \sum_{\tilde{x}_a \in a} (p_\eta * p_\eta)(\tilde{x} - \tilde{x}_a) .
$$

The probability of a noise corrupted random variable $\tilde{X}$ is approximated by the Parzen estimate using the smoothing function $(p_\eta * p_\eta)(x)$. This result is independent of the density of $X$. Often $\eta$ is Gaussian noise, a very common assumption that we will return to in our discussions of entropy. The smoothing function is then a Gaussian density with twice the variance of $\eta$ (that is, $\sqrt{2}$ times its standard deviation), since convolving two Gaussians adds their variances.

Finding the Best Smoothing Functions

As we have seen, when a priori information about the density is available Parzen estimation will converge to the correct density. Moreover, when we know either that the density is smooth or that it has been perturbed by noise, it is possible to find the correct smoothing function. In the absence of a priori information, the quality of the Parzen estimate depends on the variance of the smoothing functions. Figures 2.7 and 2.8 display the dependence of the density estimate on this variance. Each shows the Parzen estimates computed from a 100 point sample as the variance is changed. Notice that the density function that results is very dependent on the variance. The qualitative nature of this dependence varies across the range of variances shown.
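The dependence on variance described above can be reproduced with a short sketch. This is only an illustration under stated assumptions: it uses NumPy, a Gaussian smoothing function, and an arbitrarily seeded 100 point sample from a unit Gaussian; the function name `parzen_density`, the seed, and the evaluation grid are hypothetical choices, not taken from the text.

```python
import numpy as np

def parzen_density(x, sample, variance):
    """Parzen estimate at the points x: the average of Gaussian smoothing
    functions of the given variance, one centered on each sample point."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    sample = np.asarray(sample, dtype=float)
    diffs = x[:, None] - sample[None, :]              # shape (len(x), N_a)
    kernels = np.exp(-0.5 * diffs**2 / variance)
    kernels /= np.sqrt(2.0 * np.pi * variance)        # normalize each Gaussian
    return kernels.mean(axis=1)                       # (1/N_a) * sum over sample

# Assumption: a 100 point sample from a Gaussian with mean 0 and variance 1.
rng = np.random.default_rng(0)                        # arbitrary seed
sample = rng.normal(0.0, 1.0, size=100)

# Evaluate the estimate on a grid for five smoothing variances
# spanning a factor of 256.
grid = np.linspace(-8.0, 8.0, 3201)
variances = [0.005, 0.02, 0.08, 0.32, 1.28]
estimates = {v: parzen_density(grid, sample, v) for v in variances}
```

Plotting each entry of `estimates` against `grid` should show spiky estimates for the smallest variances and progressively smoother, broader estimates as the variance grows.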
[Figure 2.7: Five plots of the Parzen density estimates derived from a 100 point sample of a Gaussian. The Gaussian has variance 1.0 and mean 0.0. The different estimates use a different value for the variance of the component smoothing functions. The variances used range over a factor of 256, from 0.005 to 1.28 (curves shown for 0.005, 0.02, 0.08, 0.32, and 1.28).]

When the variance of the smoothing function is small, less than 0.1, the resulting density changes very rapidly as variance is changed. Above 0.2, small changes in variance do not change the resulting density nearly as rapidly.

Selection of the correct variance for the smoothing functions need not be a hit or miss process. Much in the same way that likelihood can be used to find the parameters of a Gaussian to fit a sample, likelihood can be used to find the variance of the Gaussians that make up the Parzen estimate. In general it is possible to compute the best variance for each Gaussian in the Parzen density estimate separately. This process requires a great deal of time and data. Since we wish to preserve the simplicity of the Parzen estimate, a single variance will be used for all of the smoothing functions.

Recall that likelihood is maximized when empirical entropy is minimized (see Section 2.3.1). Since subsequent chapters will focus on empirical entropy, we will use empirical entropy to estimate the optimal variance. Figure 2.9 graphs the empirical entropy of the sample versus variance. The sample used in this graph is the same as was used to estimate the densities in Figures 2.7 and 2.8. The broad minimum in entropy at 0.25 implies that the Parzen density estimate is not critically dependent on variance. The variance need only be within a factor of ten of the optimal variance.
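A minimal sketch of this variance selection follows. It scans a range of variances and keeps the one that minimizes empirical entropy, taken here as the negative mean log density. Several things are assumptions of the sketch rather than details from the text: the use of NumPy, the function names, the seed, the grid of candidate variances, and in particular the choice to measure entropy on a second, independently drawn sample instead of the sample used to build the estimate.

```python
import numpy as np

def parzen_log_density(x, sample, variance):
    """Log of the Gaussian Parzen estimate at points x, computed with a
    log-sum-exp shift so that tiny kernel values do not underflow."""
    x = np.asarray(x, dtype=float)
    sample = np.asarray(sample, dtype=float)
    diffs = x[:, None] - sample[None, :]
    log_k = -0.5 * diffs**2 / variance - 0.5 * np.log(2.0 * np.pi * variance)
    shift = log_k.max(axis=1, keepdims=True)
    return shift[:, 0] + np.log(np.exp(log_k - shift).mean(axis=1))

def empirical_entropy(points, sample, variance):
    """Negative mean log density of `points` under the Parzen estimate
    built from `sample` (in nats)."""
    return -parzen_log_density(points, sample, variance).mean()

rng = np.random.default_rng(0)                   # arbitrary seed
construct = rng.normal(0.0, 1.0, size=100)       # builds the Parzen estimate
evaluate = rng.normal(0.0, 1.0, size=100)        # assumption: separate sample

# Candidate variances, log-spaced over the same factor of 256 as Figure 2.7.
variances = np.geomspace(0.005, 1.28, 33)
entropies = np.array(
    [empirical_entropy(evaluate, construct, v) for v in variances]
)
best_variance = variances[np.argmin(entropies)]
```

Printing `entropies` against `variances` gives a curve that can be compared qualitatively with the description of Figure 2.9; the exact location of the minimum depends on the samples drawn.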
2.4. MODELING DENSITIES (AI-TR 1548)

[Figure 2.8: A parametric surface plot of the Parzen density versus variance (this is the same data shown in the previous graph). The horizontal and vertical axes are the location and density respectively. Variance changes with depth in the graph. Here variance ranges from 0.80 to 0.01.]

The true entropy of a Gaussian with variance 1.0 is $\frac{1}{2}\ln(2\pi e) \approx 1.419$. The optimal Parzen density estimate has an empirical entropy of 1.47. This close agreement is not a coincidence. It is argued in the next chapter that the true entropy of a density can be effectively estimated from a Parzen density estimate.

There is a small technical note that should not be overlooked. We must be careful whenever the same sample is used both to construct the Parzen estimate and to estimate entropy. Recall that the most likely, or lowest entropy, density estimate for a sample is a collection of delta functions centered at each point from the sample (see 2.2.1). We also know that this delta function density will have an entropy of negative infinity. The Parzen density is very similar in form to the delta function density. It too centers a function at each point from the sample. In the limit as the variance of the smoothing functions tends towards zero, each smoothing function approximates a delta function. T...

This note was uploaded on 02/10/2010 for the course TBE 2300 taught by Professor Cudeback during the Spring '10 term at Webber.
