parameter



Parameter Estimation
PR, ANN, & ML

Notational Convention
- Probabilities
  - Mass (discrete) function: capital letters
  - Density (continuous) function: small letters
- Vector vs. scalar
  - Scalar: plain
  - Vector: bold
  - 2D: small
  - Higher dimension: capital
- Notes in a continuous state of fluctuation until a topic is finished (many updates)

Parameter Estimation
- Optimal classifier maximizes the posterior, which depends on
  - a priori probability
  - class-conditional density
- Assumption
  - no correlation
  - time-independent statistics

$$P(\omega_i \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \omega_i)\, P(\omega_i)}{p(\mathbf{x})}$$

Popular Approaches
- Parametric: assume a certain parametric form for $p(\mathbf{x} \mid \omega_i)$ and estimate the parameters
- Nonparametric: do not assume a parametric form for $p(\mathbf{x} \mid \omega_i)$; estimate the density profile directly
- Boundary: estimate the separation hyperplane (hypersurface) between $p(\mathbf{x} \mid \omega_i)$ and $p(\mathbf{x} \mid \omega_j)$

A priori probability
- Given the numbers of occurrence $(\omega_1, n_1), (\omega_2, n_2), \ldots, (\omega_k, n_k)$:
  - if the number of samples is large enough
  - the selection process is not biased
  - Caveat: sampling may be biased

$$M = \sum_{i=1}^{k} n_i, \qquad P(\omega_i) = \frac{n_i}{M}, \quad i = 1, \ldots, k$$

Class conditional density
- More complicated (not a single number, but a distribution)
  - assume a certain form
  - estimate the parameters
- What form should we assume?
  - Many, but in this course
  - We use almost exclusively Gaussian

Gaussian Distribution
- Gaussian (or Normal)
- Scalar case

$$p(x \mid \omega_i) = N(\mu_i, \sigma_i^2) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\, e^{-\frac{(x - \mu_i)^2}{2\sigma_i^2}}$$

- Vector case

$$p(\mathbf{x} \mid \omega_i) = N(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i) = \frac{1}{(2\pi)^{d/2}\, |\boldsymbol{\Sigma}_i|^{1/2}}\, e^{-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_i)^T \boldsymbol{\Sigma}_i^{-1} (\mathbf{x} - \boldsymbol{\mu}_i)}$$

- Unknowns
  - class mean and variance

[Figure: scalar Gaussian density $\frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$ centered at $\mu$; axes labeled "feature" and "population"]

Why Gaussian (Normal)?
- Central limit theorem predicts a normal distribution from IID experiments
- In reality
  - There are only two numbers to estimate in the scalar case (mean and variance), or d + d(d+1)/2 in d dimensions
  - Nice mathematical properties (e.g., the Fourier transform of a Gaussian is a Gaussian; the product of Gaussian densities is Gaussian up to normalization, and sums of independent Gaussian variables are Gaussian; any linear transform of a Gaussian is a Gaussian)

Projection Transformation
- In particular, a whitening transform can diagonalize the covariance matrix

Parameter Estimation
- Maximum likelihood estimator
  - Parameters have fixed but unknown values
- Bayesian estimator
  - Parameters are random variables with known a priori distributions
  - The Bayesian estimator allows us to change the a priori distribution by incorporating measurements to sharpen the profile

Graphically
- MLE
- Bayesian
[Figure: likelihood plotted against the parameters θ]

Maximum Likelihood Estimator
- Given
  - n labeled samples (observations)
  - an assumed distribution of e parameters
  - samples are drawn independently from the assumed distribution
- Find
  - parameter that best explains the observations

$\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n\}$...
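The parametric approach outlined in these slides (count-based a priori probabilities, Gaussian class-conditional densities, maximum likelihood parameters, and Bayes' rule) can be sketched in Python with NumPy. This is a minimal illustration, not the course's own code: the function names (`estimate_priors`, `gaussian_mle`, `log_gaussian`, `classify`) and the synthetic two-class data are assumptions made for the example. The closed-form Gaussian estimates used here (sample mean, and covariance divided by n) are the standard maximum likelihood result; the preview is cut off before the slides derive them.

```python
import numpy as np


def estimate_priors(labels):
    """A priori probabilities P(w_i) = n_i / M from class counts."""
    classes, counts = np.unique(labels, return_counts=True)
    return classes, counts / counts.sum()


def gaussian_mle(samples):
    """Maximum likelihood mean and covariance of a Gaussian.

    `samples` has shape (n, d). The ML covariance divides by n,
    not by n - 1 as the unbiased estimator does.
    """
    mean = samples.mean(axis=0)
    centered = samples - mean
    cov = centered.T @ centered / samples.shape[0]
    return mean, cov


def log_gaussian(x, mean, cov):
    """Log of the d-dimensional Gaussian density N(mean, cov) at x."""
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)           # log |cov|
    maha = diff @ np.linalg.solve(cov, diff)     # (x-mu)^T cov^{-1} (x-mu)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def classify(x, classes, priors, params):
    """Pick the class maximizing log p(x | w_i) + log P(w_i).

    p(x) is the same for every class, so it can be dropped
    when comparing posteriors.
    """
    scores = [
        np.log(prior) + log_gaussian(x, mean, cov)
        for prior, (mean, cov) in zip(priors, params)
    ]
    return classes[int(np.argmax(scores))]


# Synthetic two-class data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
data0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))
data1 = rng.normal(loc=[4.0, 4.0], scale=1.0, size=(200, 2))
features = np.vstack([data0, data1])
labels = np.array([0] * 200 + [1] * 200)

classes, priors = estimate_priors(labels)
params = [gaussian_mle(features[labels == c]) for c in classes]

print("priors:", priors)
print("class of (3.5, 4.2):", classify(np.array([3.5, 4.2]), classes, priors, params))
```

Working in log space avoids underflow when the density values are very small, and `np.linalg.solve` evaluates the quadratic form without explicitly inverting the covariance matrix.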

This note was uploaded on 08/06/2008 for the course CS 290I taught by Professor Wang during the Spring '07 term at UCSB.
