1 One Way ANOVA
• A factor is a variable that has different levels (treatment levels). Each treatment level (or treatment) corresponds to one treatment group (one study population). Denote the number of treatment groups as $k$.
• The one-way completely randomized design is an experiment where there are $n_i$ replicated observations of independent experimental subjects for each treatment $i$, and the observations are $X_{i1}, X_{i2}, \cdots, X_{in_i}$ for each treatment $i$, $i = 1, \cdots, k$.
• Denote the sample mean for each treatment group $i$ as $\bar{X}_i$, where
$$\bar{X}_i = \frac{X_{i1} + X_{i2} + \cdots + X_{in_i}}{n_i}, \quad i = 1, \cdots, k.$$
• Denote the population mean for each treatment group $i$ as $\mu_i$, $i = 1, \cdots, k$.
• Analysis of variance (ANOVA) is a method that uses the data to compare the mean responses $\mu_i$, $i = 1, \cdots, k$, for $k$ treatment groups.
• Usually, we are interested in testing whether the mean responses $\mu_i$, $i = 1, \cdots, k$, are the same:
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
and the alternative is
$$H_a: \mu_i \neq \mu_j \text{ for at least one pair } (i, j), \text{ where } 1 \le i, j \le k.$$
• One-way ANOVA is used to analyze the effect of one factor (with different treatment levels), and Two-way ANOVA is used to analyze the effect of two factors.
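As a small illustration of the notation above, the sample means $\bar{X}_i$ can be computed directly from their definition. The sketch below assumes made-up observations for $k = 3$ groups with unequal sizes $n_i$; the numbers and the variable names `samples` and `sample_means` are hypothetical and not taken from these notes.

```python
# Hypothetical observations X_i1, ..., X_in_i for k = 3 treatment groups.
# The group sizes n_i are allowed to differ.
samples = {
    1: [4.0, 6.0, 5.0],       # n_1 = 3
    2: [7.0, 9.0],            # n_2 = 2
    3: [2.0, 3.0, 4.0, 3.0],  # n_3 = 4
}

# Sample mean of group i: sum of its observations divided by n_i.
sample_means = {i: sum(x) / len(x) for i, x in samples.items()}

print(sample_means)  # {1: 5.0, 2: 8.0, 3: 3.0}
```

The population means $\mu_i$ are unknown; the sample means are the quantities that ANOVA uses to compare them.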
Example 1. Researchers are interested in testing whether a dieting program and an exercise program have been effective for weight loss among participants. A simple random sample (SRS) of size $n_1 + n_2 + n_3$ is recruited, and the subjects are randomly assigned to the dieting group (sample size $n_1$), the exercise group (sample size $n_2$) and a “control” group with no diet/exercise program (sample size $n_3$). The amount of weight loss after 1 month of participation is measured for each subject, and the measurements are $X_{i1}, X_{i2}, \cdots, X_{in_i}$, $i = 1, 2, 3$. We are interested in testing whether the mean weight losses $\mu_1, \mu_2, \mu_3$ for these three treatment groups are equal.
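The random assignment step in Example 1 can be sketched as follows. This is only an illustration under assumed values: the group sizes ($n_1 = 4$, $n_2 = 3$, $n_3 = 3$), the subject IDs, and the helper name `assign_groups` are hypothetical, and the seed is fixed only so that the run is reproducible.

```python
import random


def assign_groups(subject_ids, group_sizes, seed=None):
    """Randomly split subjects into groups of the requested sizes."""
    if sum(group_sizes.values()) != len(subject_ids):
        raise ValueError("group sizes must add up to the number of subjects")
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)  # random order, so each subject is equally likely to land in any slot
    groups, start = {}, 0
    for name, size in group_sizes.items():
        groups[name] = shuffled[start:start + size]
        start += size
    return groups


# Hypothetical recruited sample of n_1 + n_2 + n_3 = 10 subjects.
subjects = list(range(1, 11))
sizes = {"diet": 4, "exercise": 3, "control": 3}
groups = assign_groups(subjects, sizes, seed=0)
for name, members in groups.items():
    print(name, sorted(members))
```

Shuffling once and cutting the shuffled list into consecutive blocks gives every subject the same chance of ending up in each group, which is what the completely randomized design requires.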
To perform a one-way ANOVA test, the following basic assumptions must be fulfilled:
1. Normality - Each population (each treatment group) from which a sample is taken is assumed to be normal.
2. Independence of observations - All samples are randomly selected and independent.
3. Equality of variances, called homoscedasticity - The populations for different treatment groups are assumed to have equal standard deviations (or variances).
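One informal way to get a feel for assumption 3 is to compare the sample standard deviations of the groups. The sketch below does only that, on made-up data; the group names and numbers are hypothetical, and this descriptive comparison is not a formal test of homoscedasticity.

```python
from statistics import stdev

# Hypothetical weight-loss measurements for three treatment groups.
samples = {
    "diet": [2.0, 4.0, 6.0],
    "exercise": [1.0, 3.0, 5.0],
    "control": [0.0, 1.0, 2.0],
}

# Sample standard deviation of each group (divisor n_i - 1).
sds = {name: stdev(x) for name, x in samples.items()}

for name, s in sds.items():
    print(f"{name}: s = {s:.2f}")

# Ratio of the largest to the smallest sample standard deviation;
# values far from 1 suggest the equal-variance assumption deserves a closer look.
ratio = max(sds.values()) / min(sds.values())
print(f"largest / smallest s = {ratio:.2f}")
```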
We are interested in comparing multiple mean responses under different treatments. The statistical inference usually has two steps:
1. An overall test of whether there is any difference among the means under different treatments, $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$. This test is the ANOVA F-test discussed below.
2. A follow-up analysis to carry out pairwise comparisons of means between different treatment groups. The test is Tukey’s test.
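To see how many comparisons the follow-up step involves, the pairs $(i, j)$ with $i < j$ can simply be listed. The sketch below assumes $k = 4$ as an illustrative value; it only enumerates the pairs and does not carry out Tukey’s test itself.

```python
from itertools import combinations

k = 4  # illustrative number of treatment groups

# All unordered pairs (i, j) with 1 <= i < j <= k.
pairs = list(combinations(range(1, k + 1), 2))

for i, j in pairs:
    print(f"compare mu_{i} with mu_{j}")

# There are k * (k - 1) / 2 pairwise comparisons in total.
print(len(pairs))  # 6
```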
2 ANOVA F-Test
Example 2. Suppose $k = 4$ and we have the following dataset from an experiment.
Treatment 1    Treatment 2    Treatment 3    Treatment 4
21             32             22.5           28
19.5           30.5           26             27.5
22.5           25             28             31
21.5           27.5           27             29.5
20.5           28             26.5           30
21             28.6           25.2           29.2
Let $X_{ij}$ denote an observation in the dataset, where $i$ indexes the $i$th level of the treatment and $j$ indexes the $j$th observation within a treatment:
subscripts    i = 1          i = 2          i = 3          i = 4
              Treatment 1    Treatment 2    Treatment 3    Treatment 4
j = 1         21             32             22.5           28
j = 2         19.5           30.5           26             27.5
j = 3         22.5           25             28             31
j = 4         21.5           27.5           27             29.5
j = 5         20.5           28             26.5           30
j = 6