Stat841f09 - Wiki Course Notes

# In many cases this simple model is sufficient to


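... classification. For example, the Z-Score credit risk model, designed by Edward Altman in 1968 and revisited in 2000 (http://pages.stern.nyu.edu/~ealtman/Zscores.pdf), is essentially a weighted LDA; it has shown an 85-90% success rate predicting bankruptcy and is still in use today.

### Purpose

1. Feature selection
2. Determining which classification rule best separates the classes

### Definition

To perform LDA (http://en.wikipedia.org/wiki/Linear_discriminant_analysis) we make two assumptions.

- The clusters belonging to all classes each follow a multivariate normal distribution. For $x \in \mathbb{R}^d$,

  $$f_k(x) = \frac{1}{(2\pi)^{d/2} |\Sigma_k|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) \right)$$

  where $f_k(x)$ is a class conditional density.
- Simplification assumption: each cluster has the same covariance matrix $\Sigma$, that is, $\Sigma_k = \Sigma$ for all $k$.

We wish to solve for the decision boundary (http://en.wikipedia.org/wiki/Decision_boundary) where the error rates for classifying a point are equal, where one side of the boundary gives a lower error rate for one class and the other side gives a lower error rate for the other class.

So we solve $\Pr(Y = k \mid X = x) = \Pr(Y = l \mid X = x)$ for all the pairwise combinations of classes $k$ and $l$, writing $\pi_k = \Pr(Y = k)$ for the prior probability of class $k$.

$$\frac{\Pr(X = x \mid Y = k) \Pr(Y = k)}{\Pr(X = x)} = \frac{\Pr(X = x \mid Y = l) \Pr(Y = l)}{\Pr(X = x)}$$

using Bayes' Theorem.

$$f_k(x) \, \pi_k = f_l(x) \, \pi_l$$

by canceling denominators.

$$\exp\left( -\frac{1}{2} (x - \mu_k)^T \Sigma^{-1} (x - \mu_k) \right) \pi_k = \exp\left( -\frac{1}{2} (x - \mu_l)^T \Sigma^{-1} (x - \mu_l) \right) \pi_l$$

since both $\Sigma$ are equal based on the assumptions specific to LDA, so the normalizing constants cancel.

$$-\frac{1}{2} (x - \mu_k)^T \Sigma^{-1} (x - \mu_k) + \log \pi_k = -\frac{1}{2} (x - \mu_l)^T \Sigma^{-1} (x - \mu_l) + \log \pi_l$$

taking the log of both sides.

$$\log \frac{\pi_k}{\pi_l} - \frac{1}{2} \left( x^T \Sigma^{-1} x + \mu_k^T \Sigma^{-1} \mu_k - 2 x^T \Sigma^{-1} \mu_k - x^T \Sigma^{-1} x - \mu_l^T \Sigma^{-1} \mu_l + 2 x^T \Sigma^{-1} \mu_l \right) = 0$$

by expanding out.

$$\log \frac{\pi_k}{\pi_l} - \frac{1}{2} \left( \mu_k^T \Sigma^{-1} \mu_k - \mu_l^T \Sigma^{-1} \mu_l - 2 x^T \Sigma^{-1} (\mu_k - \mu_l) \right) = 0$$

after canceling out like terms and factoring.

We can see that this is a linear function in $x$ with general form $a^T x + b = 0$.

This linear log function shows that the decision boundary between class $k$ and class $l$, i.e. $\Pr(Y = k \mid X = x) = \Pr(Y = l \mid X = x)$, is linear in $x$. Given any pair of classes, decision boundaries are always linear. In $d$ dimensions, we separate regions by hyperplanes.

In the special case where the number of samples from each class are equal ($\pi_k = \pi_l$), the boundary surface or line lies halfway between $\mu_k$ and $\mu_l$.

### Limitation

- LDA implicitly assumes a Gaussian distribution of the data.
- LDA implicitly assumes that the mean is the discriminating factor, not the variance.
- LDA may overfit the data.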
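As a rough numerical illustration of the pairwise LDA boundary, the sketch below computes the coefficients of the linear form $a^T x + b = 0$, taking $a = \Sigma^{-1}(\mu_k - \mu_l)$ and $b = \log(\pi_k/\pi_l) - \frac{1}{2}(\mu_k^T \Sigma^{-1} \mu_k - \mu_l^T \Sigma^{-1} \mu_l)$, where $\mu_k, \mu_l$ are the class means, $\Sigma$ is the shared covariance matrix and $\pi_k, \pi_l$ are the class priors. The function names `lda_boundary` and `classify` and all numeric values are assumptions made up for this example; they do not come from the course notes.

```python
import numpy as np


def lda_boundary(mu_k, mu_l, sigma, pi_k, pi_l):
    """Return (a, b) so that a @ x + b = 0 is the LDA boundary between k and l.

    Assumes both classes share the covariance matrix `sigma`.
    """
    mu_k = np.asarray(mu_k, dtype=float)
    mu_l = np.asarray(mu_l, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    # Solve linear systems instead of forming the explicit inverse of sigma.
    sinv_mu_k = np.linalg.solve(sigma, mu_k)
    sinv_mu_l = np.linalg.solve(sigma, mu_l)

    a = sinv_mu_k - sinv_mu_l
    b = np.log(pi_k / pi_l) - 0.5 * (mu_k @ sinv_mu_k - mu_l @ sinv_mu_l)
    return a, b


def classify(x, a, b):
    """Assign class 'k' when a @ x + b > 0, otherwise class 'l'."""
    return "k" if a @ np.asarray(x, dtype=float) + b > 0 else "l"


# Made-up parameters: two classes in 2-D with identity covariance, equal priors.
mu_k = [0.0, 0.0]
mu_l = [2.0, 2.0]
sigma = [[1.0, 0.0], [0.0, 1.0]]

a, b = lda_boundary(mu_k, mu_l, sigma, pi_k=0.5, pi_l=0.5)
midpoint = (np.array(mu_k) + np.array(mu_l)) / 2

print("a =", a, "b =", b)
print("boundary value at midpoint:", a @ midpoint + b)
```

With equal priors the log-ratio term vanishes, so the midpoint of the two means should sit on the boundary; changing `pi_k` and `pi_l` shifts the hyperplane toward the less probable class without changing its direction.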
### QDA

The concept uses the same idea as LDA of finding a boundary wher...

## This document was uploaded on 03/07/2014.
