
# 20-Discriminant2 - Last Time: Alternative Classification Methods


## Last Time: Alternative Classification Methods

- Rule Based.
- Instance Based Methods and Nearest Neighbors (kNN).

## Today

Discriminant Analysis: for continuous explanatory variables only.

## Discrimination for Continuous Explanatory Variables

Discriminant functions are the essence of the output from a discriminant analysis. They are the linear combinations of the standardised independent variables that yield the biggest mean differences between the groups. If the response is a dichotomy (only two classes to be predicted), there is one discriminant function; if the response variable has k levels (i.e. there are k classes to predict), up to k - 1 discriminant functions can be extracted, and we can test how many are worth extracting.
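As a quick illustration of the "up to k - 1 functions" rule, the sketch below fits a linear discriminant analysis on synthetic data; the data and the use of scikit-learn here are illustrative assumptions, not part of the lecture.

```python
# Minimal sketch: the number of discriminant functions is min(k - 1, p),
# where k is the number of classes and p the number of continuous predictors.
# Data are synthetic (an assumption for illustration).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
k, p, n = 3, 5, 150                  # 3 classes, 5 predictors, 150 cases
X = rng.normal(size=(n, p))
y = rng.integers(0, k, size=n)
X[y == 1] += 1.5                     # shift class means apart
X[y == 2] -= 1.5

lda = LinearDiscriminantAnalysis()
scores = lda.fit_transform(X, y)     # cases projected onto the functions
print(scores.shape)                  # (150, 2): at most k - 1 = 2 functions
```

With three classes, only two discriminant functions can be extracted no matter how many predictors there are.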
## Discriminant Functions

Successive discriminant functions are orthogonal to one another, like principal components, but they are not the same as the principal components you would obtain from a principal components analysis of the independent variables: they are constructed to maximise the differences between the classes of the response, i.e. the between-class variance, not the total variance. Unlike in principal components analysis, the input data do not have to be centered or standardised beforehand; the outcome of the final discriminant analysis is not affected by the scaling.
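The scale-invariance claim can be checked directly: refitting the analysis on standardised copies of the variables should leave the classification unchanged. The data and scikit-learn usage below are illustrative assumptions.

```python
# Sketch (synthetic data, an assumption): discriminant analysis is unaffected
# by linear rescaling of the inputs, unlike PCA, whose components change
# with the scale of each variable.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 3)) * np.array([1.0, 10.0, 100.0])  # very different scales
y = rng.integers(0, 2, size=120)
X[y == 1, 0] += 2.0                  # separate the classes on variable 1

Xs = (X - X.mean(axis=0)) / X.std(axis=0)            # standardised copy
raw = LinearDiscriminantAnalysis().fit(X, y).predict(X)
std = LinearDiscriminantAnalysis().fit(Xs, y).predict(Xs)
print((raw == std).mean())           # agreement rate; 1.0 up to rounding
```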

## Discriminant Functions (continued)

A discriminant function, also called a canonical root, is a latent variable created as a linear combination of the discriminating (independent) variables:

L = b_1 x_1 + b_2 x_2 + ... + b_p x_p + c,

where the b's are discriminant coefficients, the x's are discriminating variables, and c is a constant. This is similar to multiple regression, but the b's are discriminant coefficients chosen to maximise the distance between the means of the criterion (dependent) variable. Note that the foregoing assumes the discriminant function is estimated by ordinary least squares, the traditional method; there is also a version based on maximum likelihood estimation.
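A fitted model exposes exactly these quantities, so the score L = b_1 x_1 + ... + b_p x_p + c can be computed by hand and compared against the model's own decision function. The synthetic data and scikit-learn attribute names (`coef_`, `intercept_`) are assumptions for illustration.

```python
# Sketch: compute L = b1*x1 + ... + bp*xp + c by hand from the fitted
# coefficients of a two-class discriminant analysis (synthetic data).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4))
y = (X[:, 0] + X[:, 1] + rng.normal(size=100) > 0).astype(int)

lda = LinearDiscriminantAnalysis().fit(X, y)
b, c = lda.coef_[0], lda.intercept_[0]   # discriminant coefficients and constant
L = X @ b + c                            # the discriminant function, by hand
print(np.allclose(L, lda.decision_function(X)))  # True
```

With two classes there is a single function, so `coef_` has one row; the sign of L determines the predicted class.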
## Least-Squares Estimation of Discriminant Functions

The variance-covariance matrix can be decomposed into two parts: the variance within each class and the variability between classes. Equivalently, we can decompose the sum of squares and cross-products matrix (the same up to a constant factor):

T = B + W,

where

T = X'(I_n - P_{1_n})X   (total)
B = X'(P_G - P_{1_n})X   (between-class)
W = X'(I_n - P_G)X       (within-class)

Here I_n is the identity matrix; P_{1_n} is the orthogonal projection onto the space spanned by 1_n, i.e. P_{1_n} = 1_n 1_n' / n, so that (I_n - P_{1_n})X is the matrix of centered cases; and P_G is the projection onto the subspace generated by the columns of the binary discriminating matrix G, which has g columns and a one in row i, column j if observation i belongs to group j. For any coefficient vector a,

a'Ta = a'Ba + a'Wa.
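The decomposition can be verified numerically by building the projection matrices explicitly. The data below (n cases, p variables, g groups of equal size) are made up for illustration.

```python
# Sketch of the decomposition T = B + W using projection matrices,
# on made-up data (all sizes are assumptions for illustration).
import numpy as np

n, p, g = 30, 4, 3
rng = np.random.default_rng(3)
X = rng.normal(size=(n, p))
y = np.repeat(np.arange(g), n // g)      # group label of each case

G = np.eye(g)[y]                         # indicator matrix: G[i, j] = 1 iff case i is in group j
ones = np.ones((n, 1))
P1 = ones @ ones.T / n                   # projection onto span{1_n}
Pg = G @ np.linalg.inv(G.T @ G) @ G.T    # projection onto the column space of G

T = X.T @ (np.eye(n) - P1) @ X           # total SSCP of the centered cases
B = X.T @ (Pg - P1) @ X                  # between-class part
W = X.T @ (np.eye(n) - Pg) @ X           # within-class part
print(np.allclose(T, B + W))             # True
```

The identity holds because (I_n - P_{1_n}) = (P_G - P_{1_n}) + (I_n - P_G) term by term.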
