Discriminant Analysis
James H. Steiger
Department of Psychology and Human Development
Vanderbilt University

Discriminant Analysis
1. Introduction
2. Classification in One Dimension
     A Simple Special Case
3. Classification in Two Dimensions
     The Two-Group Linear Discriminant Function
     Plotting the Two-Group Discriminant Function
     Unequal Probabilities of Group Membership
     Unequal Costs
4. More than Two Groups
     Generalizing the Classification Score Approach
     An Alternate Approach: Canonical Discriminant Functions
     Tests of Significance
5. Canonical Dimensions in Discriminant Analysis
6. Statistical Variable Selection in Discriminant Analysis

Introduction
There are two prototypical situations in multivariate analysis that are, in a
sense, different sides of the same coin. Suppose we have identifiable
groups, and they may (or may not) differ in their means (and possibly in
their covariance structure) on one or more response measures.
  1. How can we test whether the groups are significantly different?
  2. If the groups are different, how can we construct a rule that allows us
     to accurately assign an individual to one of the groups, depending
     on that individual's scores on the response measures?
In this module, we will deal with the second problem, examining, in
detail, a method known as discriminant analysis.
However, the first problem, which is addressed by a technique known as
MANOVA (Multivariate Analysis of Variance), is closely related to the second.

Classification in One Dimension
There are many situations in which we measure a response variable on
a group of people, objects, or situations, and then try to sort each of
them into one of two or more groups depending on its score on that variable.
Some examples? (C.P.)

Classification in One Dimension – Some Examples
Your response variable is the color of a test strip. You try to sort
individuals into:
  1. Pregnant
  2. Non-Pregnant
Your response variable is a brief sensation of a change of illumination
against a very dark background. You try to decide whether a very dim
signal light is:
  1. Present
  2. Not Present
You have individuals who are either male or female, and you have
their heights. You try to devise a rule that will, with the highest
possible degree of accuracy, decide only on the basis of height
whether a person is:
  1. Male
  2. Female

A Simple Special Case
As a simple special case, suppose we consider the whole population of
men and women, and imagine that we knew that both populations
are normally distributed with standard deviations of 2.5, but men
have a mean of 70, women of 65.
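
To make this special case concrete, here is a minimal Python sketch of one
possible classification rule. It rests on two assumptions that the text above
does not state: men and women are equally likely a priori, and both kinds of
misclassification are equally costly. Under those assumptions, a natural rule
assigns a person to the group whose normal density is higher at that person's
height, which for equal standard deviations amounts to a cutoff at the midpoint
of the two means. The names classify, cutoff, and error_rate are illustrative
only.

```python
from statistics import NormalDist

# Population values taken from the text: equal SDs of 2.5, means 70 and 65.
men = NormalDist(mu=70, sigma=2.5)
women = NormalDist(mu=65, sigma=2.5)

def classify(height):
    """Assign the group whose density is higher at this height.

    Assumes equal prior probabilities and equal misclassification costs
    (an assumption of this sketch, not something stated in the text).
    Ties at the cutoff go to "Female" here, an arbitrary choice.
    """
    return "Male" if men.pdf(height) > women.pdf(height) else "Female"

# With equal SDs the two densities cross at the midpoint of the means.
cutoff = (men.mean + women.mean) / 2

# Probability of each kind of error under the midpoint rule.
p_man_called_female = men.cdf(cutoff)        # man falls below the cutoff
p_woman_called_male = 1 - women.cdf(cutoff)  # woman falls above the cutoff

# Overall error rate, weighting the two groups equally (assumed priors).
error_rate = 0.5 * p_man_called_female + 0.5 * p_woman_called_male

print(f"cutoff = {cutoff}")
print(f"classify(68) = {classify(68)}, classify(67) = {classify(67)}")
print(f"overall error rate = {error_rate:.4f}")
```

The cutoff sits one standard deviation from each mean, so each error
probability is the standard normal area beyond one standard deviation, roughly
0.16. If the priors or costs were unequal, the cutoff would shift away from the
midpoint.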