# Machine Learning, Neural and Statistical Classification: Part 10


[Sec. 9.6 Measures, p. 173]

number of attributes is fairly large in relation to the number of examples per class, and partly because Mardia's statistic is less efficient than the univariate statistics.

## 9.6.3 Head injury

Among the datasets with more than two classes, the clearest evidence of collinearity is in the head injury dataset. Here the second canonical correlation is not statistically different from zero, with a critical level of 0.074. It appears that a single linear discriminant is sufficient to discriminate between the classes (more precisely: a second linear discriminant does not improve the discrimination). Therefore the head injury dataset is very close to linearity. This may also be observed from the value of fract1 = 0.979, implying that the three class means lie close to a straight line. In turn, this suggests that the class values reflect some underlying continuum of severity, so this is not a true discrimination problem. Note the similarity with Fisher's original use of discrimination as a means of ordering populations. Perhaps this dataset would best be dealt with by a pure regression technique, either linear or logistic. If so, Manova gives the best set of scores for the three categories of injury as (0.681, -0.105, -0.725), indicating that the middle group is slightly nearer to category 3 than to category 1, but not significantly nearer.

It appears that there is not much difference between the covariance matrices for the three populations in the head dataset (SD ratio = 1.1231), so quadratic discrimination is not expected to do much better than linear discrimination (and will probably do worse, since it uses many more parameters).

## 9.6.4 Heart disease

The leading correlation coefficient cancor1 = 0.7384 in the heart dataset is not very high (bear in mind that it is the correlation that gives a measure of predictability). Therefore the discriminating power of the linear discriminant is only moderate.
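How a canonical correlation such as cancor1 is obtained can be sketched as follows. This is a minimal NumPy illustration, not the StatLog implementation itself: it assumes the standard canonical discriminant definition, in which the squared canonical correlations are l_i / (1 + l_i) for the eigenvalues l_i of W⁻¹B, with W and B the within- and between-class scatter matrices.

```python
import numpy as np

def canonical_correlations(X, y):
    """Canonical correlations between the attributes and the class labels.

    Sketch assuming the standard definition: rho_i = sqrt(l_i / (1 + l_i)),
    where l_i are the (at most n_classes - 1) nonzero eigenvalues of
    W^{-1} B, with W the pooled within-class scatter and B the
    between-class scatter.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    d = X.shape[1]
    W = np.zeros((d, d))
    B = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        W += (Xc - mc).T @ (Xc - mc)                      # within-class scatter
        B += len(Xc) * np.outer(mc - grand_mean, mc - grand_mean)  # between-class
    # Eigenvalues of W^{-1} B; keep the n_classes - 1 largest (the rest are ~0).
    eigvals = np.linalg.eigvals(np.linalg.solve(W, B)).real
    eigvals = np.clip(np.sort(eigvals)[::-1][: len(classes) - 1], 0.0, None)
    return np.sqrt(eigvals / (1.0 + eigvals))
```

For two well-separated classes the leading value approaches 1; for classes drawn from the same distribution it stays near 0, which is the sense in which cancor1 measures predictability.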
This ties up with the moderate success of linear discriminants for this dataset (a cost of 0.32 on the training data).

## 9.6.5 Satellite image dataset

The satellite image data is the only dataset for which there appear to be very large correlations between the attributes (corr.abs = 0.5977), although there may be some large correlations in the vehicle dataset (though presumably not too many), since there corr.abs = 0.4828. Note that only three linear discriminants are sufficient to separate all six class means (fract3 = 0.9691). This may be interpreted as evidence of seriation, with the three classes "grey soil", "damp grey soil" and "very damp grey soil" forming a continuum. Equally, this result can be interpreted as indicating that the original 36 attributes may be successfully reduced to three with no loss of information. Here "information" should be interpreted as mean square distance between classes, or equivalently, as the entropy of a normal distribution.
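The fract statistics used throughout this section, and the reduction of many attributes to a few discriminants, can be sketched together. The following NumPy illustration assumes fract_k is the cumulative fraction (l_1 + ... + l_k) / (l_1 + ... + l_m) of the ordered eigenvalues l_i of W⁻¹B, with m the number of nonzero eigenvalues; projecting onto the leading eigenvectors gives the reduced representation (e.g. 36 attributes down to 3 discriminants).

```python
import numpy as np

def discriminant_fractions(X, y, k):
    """Return (fract, Z): cumulative eigenvalue fractions of W^{-1} B,
    and the data projected onto the first k discriminant directions.

    Sketch assuming fract_j = (l_1 + ... + l_j) / (l_1 + ... + l_m),
    where m = min(n_classes - 1, n_attributes) bounds the number of
    nonzero eigenvalues.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    d = X.shape[1]
    W = np.zeros((d, d))
    B = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        W += (Xc - mc).T @ (Xc - mc)                      # within-class scatter
        B += len(Xc) * np.outer(mc - grand_mean, mc - grand_mean)  # between-class
    vals, vecs = np.linalg.eig(np.linalg.solve(W, B))
    order = np.argsort(vals.real)[::-1]                   # largest eigenvalue first
    vals = np.clip(vals.real[order], 0.0, None)
    vecs = vecs.real[:, order]
    m = min(len(classes) - 1, d)
    fract = np.cumsum(vals[:m]) / vals[:m].sum()
    Z = X @ vecs[:, :k]                                   # k-dimensional projection
    return fract, Z
```

With three class means lying close to a straight line, as in the head injury data, a single discriminant captures nearly all the between-class scatter and fract_1 comes out close to 1.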
