Stat841f09 - Wiki Course Notes

# Is called the prior probability


...or_probability). This is a probability distribution that represents what we know (or believe we know) about a population. ... is the sum with respect to all classes.

## Approaches

Although the Bayes classifier is the optimal method, it cannot be used in most practical situations, since the prior probability is usually unknown. Fortunately, other methods of classification have evolved. These methods fall into three general categories.

1. Empirical Risk Minimization (http://en.wikipedia.org/wiki/Supervised_learning): choose a set of classifiers and find the one that minimizes some estimate of the risk.
2. Regression: find an estimate of the regression function and define the classifier from it.
3. Density estimation (http://en.wikipedia.org/wiki/Density_estimation): estimate the conditional density of each class.

Note: The third approach, in this form, is not popular because density estimation does not work very well in dimensions greater than 2. However, this approach is the simplest, and we can assume a parametric model for the densities. Linear Discriminant Analysis and Quadratic Discriminant Analysis are examples of the third approach, density estimation.

## LDA

### Motivation

The Bayes classifier is optimal. Unfortunately, the prior and conditional density of most data are not known, so they must be estimated before any data can be classified. The simplest way to do this is to assume that every class density is approximately a multivariate normal distribution (http://en.wikipedia.org/wiki/Multivariate_normal_distribution), find the parameters of each such distribution, and use them to calculate the conditional density and prior for unknown points, thus approximating the Bayes classifier and choosing the most likely class. In addition, if the covariance of each class density is assumed to be the same, the number of unknown parameters is reduced and the model is easy to fit and use, as seen later.
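The recipe described above (Gaussian class densities with one shared covariance, plugged into Bayes' rule) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the notes' own implementation: the helper names `fit_lda` and `predict_lda`, the pooled-covariance estimator with divisor n - K, and the synthetic two-class data are all assumptions made for the example.

```python
import numpy as np

def fit_lda(X, y):
    """Estimate priors, class means, and one pooled covariance (hypothetical helper)."""
    classes = np.unique(y)
    n, d = X.shape
    # Prior of each class: its share of the training sample.
    priors = np.array([np.mean(y == k) for k in classes])
    # Mean vector of each class, stacked into shape (K, d).
    means = np.array([X[y == k].mean(axis=0) for k in classes])
    # Pooled covariance: within-class scatter summed over classes,
    # divided by n - K (an assumed, commonly used estimator).
    pooled = np.zeros((d, d))
    for k, mu in zip(classes, means):
        centered = X[y == k] - mu
        pooled += centered.T @ centered
    pooled /= n - len(classes)
    return classes, priors, means, pooled

def predict_lda(X, classes, priors, means, pooled):
    """Assign each row of X to the class with the largest discriminant score."""
    # With a shared covariance S, the log posterior of class k reduces
    # (up to terms that do not depend on k) to a function linear in x:
    #   x^T S^-1 mu_k - 0.5 * mu_k^T S^-1 mu_k + log(prior_k)
    sinv_mu = np.linalg.solve(pooled, means.T)            # shape (d, K)
    offsets = -0.5 * np.sum(means.T * sinv_mu, axis=0) + np.log(priors)
    scores = X @ sinv_mu + offsets                        # shape (n, K)
    return classes[np.argmax(scores, axis=1)]

# Synthetic data (an assumption for illustration): two Gaussian classes
# that share an identity covariance and differ only in their means.
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(200, 2))
X1 = rng.normal(loc=[2.0, 0.0], scale=1.0, size=(200, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

params = fit_lda(X, y)
accuracy = np.mean(predict_lda(X, *params) == y)
print(f"training accuracy: {accuracy:.3f}")
```

Because every class uses the same covariance, the quadratic term in x is identical across classes and cancels when scores are compared, which is why only a linear score per class needs to be computed.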
### History

The name Linear Discriminant Analysis comes from the fact that these simplifications produce a linear model, which is used to discriminate between classes. In many cases, this simple model is sufficient to provide a near optimal c...