STAT E-50 - Introduction to Statistics
One-Way Analysis of Variance
To compare two or more groups:
• To test two proportions, use a z-test
• To test more than two proportions, use a $\chi^2$ test
• To test two means, use a t-test
• To test more than two means, use Analysis of Variance (ANOVA)
In Analysis of Variance, we are testing for the equality of the means of several levels of a variable. The technique is to compare the variation between the levels and the variation within each level. (The levels of the variable are also referred to as groups, or treatments.)
If the variation due to the level (variation between levels) is significantly larger than the variation within each level, then we can conclude that the means of the levels are not all the same.
We will test the hypothesis $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ vs. $H_a$: the means are not all equal, using the ratio:
$$\frac{\text{differences among means}}{\text{variability within the groups}}$$
When the numerator is large compared to the denominator, we will reject the null hypothesis.
• The numerator measures the variation between groups; this is called the Treatment Mean Square, or $MS_T$
• The denominator measures the variation within groups; this is called the Error Mean Square, or $MS_E$
• The test statistic is $F = \frac{MS_T}{MS_E}$; reject $H_0$ when F is large.
• $MS_T$ has k - 1 degrees of freedom, where k = the number of groups
• $MS_E$ has k(n - 1) degrees of freedom, where n = the number of observations from each group
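The quantities above can be sketched in a few lines of Python. This is a minimal illustration, not part of the course materials: the function name `one_way_anova` and the data are made up, and the sketch assumes every group has the same number of observations n, so that the error degrees of freedom are k(n - 1).

```python
from statistics import mean

def one_way_anova(groups):
    """Return (MS_T, MS_E, F) for a list of equally sized groups (hypothetical helper)."""
    k = len(groups)                 # number of groups
    n = len(groups[0])              # observations per group
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")

    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]

    # Variation between groups: spread of the group means around the grand mean.
    ss_t = sum(n * (m - grand_mean) ** 2 for m in group_means)
    # Variation within groups: spread of each observation around its own group mean.
    ss_e = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    ms_t = ss_t / (k - 1)           # Treatment Mean Square, k - 1 df
    ms_e = ss_e / (k * (n - 1))     # Error Mean Square, k(n - 1) df
    return ms_t, ms_e, ms_t / ms_e

# Made-up illustrative data: three groups of three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
ms_t, ms_e, f = one_way_anova(groups)
print("MS_T =", ms_t, "MS_E =", ms_e, "F =", f)
```

For this made-up data the group means (2, 3, 7) are far apart relative to the spread inside each group, so F comes out large (21.0), which is the situation in which the null hypothesis would be rejected.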

Assumptions and Conditions
First plot the data in side-by-side boxplots. Look for outliers, similar spreads, similar centers.
Independence Assumption
The groups must be independent of each other.
The data within each group must be independent data drawn at random from a
homogeneous population, or generated by a randomized comparative experiment.
Randomization Condition:
Was the data collected randomly?
Equal Variance Assumption
We need the variances of the treatment groups to be equal, so we can find a pooled variance.
Similar Variance Condition:
• Compare the spreads of the groups in the side-by-side boxplots
• Are the spreads changing systematically with the centers?
• Plot the residuals vs. the predicted values to be sure you don’t have larger residuals for larger values
Normal Population Assumption
Nearly Normal Condition:
Compare the boxplots for skewness.
Pool the residuals and create a histogram or normal probability plot (NPP).
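The residuals used in these checks can be computed as sketched below. The sketch assumes that the predicted value for an observation is the mean of its group and that the residual is the observation minus that mean; the helper name `residuals_by_group` and the data are made up for illustration. It only prepares the numbers, and the plots themselves (boxplots, residuals vs. predicted values, histogram or NPP) would still be drawn with whatever graphing tool is at hand.

```python
from statistics import mean, stdev

def residuals_by_group(groups):
    """Return (predicted, residual) pairs; predicted = the observation's group mean."""
    pairs = []
    for g in groups:
        m = mean(g)
        pairs.extend((m, x - m) for x in g)
    return pairs

# Made-up illustrative data: three groups of three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]

# Similar Variance Condition: compare the spread of each group.
spreads = [stdev(g) for g in groups]

# Nearly Normal Condition: pool the residuals from all groups.
pairs = residuals_by_group(groups)
pooled_residuals = [r for _, r in pairs]

print("group standard deviations:", spreads)
print("(predicted, residual) pairs:", pairs)
```

Plotting the second element of each pair against the first gives the residuals vs. predicted values plot, and `pooled_residuals` is the list to pass to a histogram or NPP.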
The ANOVA table:

Source    df        SS        MS                      F                    p
Factor    $df_T$    $SS_T$    $MS_T = SS_T / df_T$    $F = MS_T / MS_E$
Error     $df_E$    $SS_E$    $MS_E = SS_E / df_E$
Total
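As a rough illustration of how the rows of the ANOVA table fit together, the sketch below fills in the df, SS, MS and F columns for made-up data. The function name `anova_table` is hypothetical. The error degrees of freedom are computed as (total observations) - k, which equals k(n - 1) when every group has n observations. The Total row uses the standard facts that the total df is one less than the number of observations and that the total sum of squares equals $SS_T + SS_E$. The p column is left out because it needs the F distribution, which the Python standard library does not provide.

```python
from statistics import mean

def anova_table(groups):
    """Return rows (source, df, SS, MS, F) for a one-way ANOVA (hypothetical helper)."""
    k = len(groups)
    all_obs = [x for g in groups for x in g]
    n_total = len(all_obs)
    grand_mean = mean(all_obs)

    ss_t = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_e = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_total = sum((x - grand_mean) ** 2 for x in all_obs)

    df_t, df_e = k - 1, n_total - k
    ms_t, ms_e = ss_t / df_t, ss_e / df_e
    return [
        ("Factor", df_t, ss_t, ms_t, ms_t / ms_e),
        ("Error", df_e, ss_e, ms_e, None),
        ("Total", n_total - 1, ss_total, None, None),
    ]

def fmt(value):
    """Format a table cell, leaving empty cells blank."""
    return "" if value is None else f"{value:.3f}"

# Made-up illustrative data: three groups of three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
rows = anova_table(groups)

print(f"{'Source':<8}{'df':>4}{'SS':>10}{'MS':>10}{'F':>10}")
for source, df, ss, ms, f_stat in rows:
    print(f"{source:<8}{df:>4}{fmt(ss):>10}{fmt(ms):>10}{fmt(f_stat):>10}")
```

If SciPy is available, the p-value for the Factor row could be obtained from the upper tail of the F distribution, for example with `scipy.stats.f.sf(F, df_T, df_E)`.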