ISYE 2028 A and B
Spring 2009
Lecture 16
Dr. Kobi Abayomi
April 13, 2009
1 Introduction - The "simplest" model - The ANOVA model
In studying methods for the analysis of quantitative data, we first focused on problems involving a
single sample of numbers and then turned to a comparative analysis of two different such samples.
In one-sample problems, the data consisted of observations of individuals randomly selected from
a single population.
In two-sample problems, either the two samples were drawn from two different populations, or else
two different treatments were applied to elements selected from a single population.
The analysis of variance, or ANOVA, refers to a collection of procedures for the analysis of responses
from experimental units. The simplest ANOVA problem is referred to as a single factor or one way
ANOVA and involves analysis either of data sampled from two or more populations or of data in
which two or more treatments have been used. As such, the ANOVA setup is a generalization of
the two-sample t-test.
The characteristic that differentiates the treatments or populations from one another is called the
factor, and the different treatments are referred to as the levels of the factor. Let's begin with an
example...
2 Single Factor or One Way ANOVA
2.1 Setup and Notation
Briefly, say a farmer wants to investigate whether flower production differs across gardens. There are
three gardens: A, B, and C. We are given 10 weeks of data: the number of flowers grown per week
per garden. As always, we introduce some notation. Let:
j ≡ An index for treatments or populations being compared.
K ≡ The number of treatments in total. Here K = 3.
i ≡ An index for the observations.
n_j ≡ The number of observations in treatment j.
μ_j ≡ The mean of population or treatment j. Here j = 1 is garden A, j = 2 is garden B, and j = 3
is garden C. Of course, x̄_j is the sample mean of the jth treatment.
We seek to test for a difference in gardens. Our null hypothesis is, then, that there is no difference
in gardens, versus an alternative that there is at least one difference between gardens. In notation:
H_0: μ_1 = μ_2 = μ_3
H_a: At least two means differ.
Now we need, of course, a test statistic, or a function of the observed values that we will link to
some probability distribution. Let's introduce some more notation...
X_{i,j} = the random variable that denotes the ith observation on the jth treatment. What is X_{2,2}?
x_{i,j} = the observed value of X_{i,j} when the experiment is performed or the data are recorded.
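To make the indexing concrete, here is a minimal sketch of how the observed values x_{i,j} might be stored and looked up. The flower counts are made up for illustration, only three of the ten weeks are shown, and the names `flowers`, `gardens`, and `x` are assumptions of this sketch rather than anything from the lecture.

```python
# Hypothetical weekly flower counts (invented for illustration only).
# Each row is one observation i (a week); each column is one treatment j (a garden).
gardens = ["A", "B", "C"]
flowers = [
    [12, 15, 9],   # week 1
    [14, 13, 11],  # week 2
    [10, 16, 8],   # week 3
]


def x(i, j):
    """Return x_{i,j}, the ith observation on the jth treatment (1-indexed as in the notes)."""
    return flowers[i - 1][j - 1]


K = len(gardens)                     # number of treatments
n = [len(flowers) for _ in gardens]  # n_j for each treatment (equal in this sketch)

# x_{2,2} is the second observation (week 2) on the second treatment (garden B).
print("x_{2,2} =", x(2, 2), "for garden", gardens[2 - 1])
```

Before the data are recorded, the same cell is the random variable X_{2,2}; after recording, it is the number x_{2,2} looked up above.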
The individual treatment means, that is, the mean across observations for each treatment, are
calculated...
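As a sketch of that calculation, the sample mean of treatment j is x̄_j = (x_{1,j} + ... + x_{n_j,j}) / n_j. The code below applies this to ten weeks of flower counts per garden. The counts are hypothetical values invented for this illustration, and the names `flowers`, `n`, and `treatment_means` are assumptions of the sketch.

```python
# Hypothetical flower counts per garden over 10 weeks (invented for illustration only).
flowers = {
    "A": [12, 14, 10, 11, 13, 12, 15, 9, 12, 12],
    "B": [15, 13, 16, 14, 17, 15, 12, 18, 15, 15],
    "C": [9, 11, 8, 10, 12, 10, 9, 11, 10, 10],
}

# n_j: the number of observations in each treatment.
n = {garden: len(obs) for garden, obs in flowers.items()}

# Sample mean of treatment j: sum the observations x_{i,j} over i, then divide by n_j.
treatment_means = {garden: sum(obs) / len(obs) for garden, obs in flowers.items()}

for garden in flowers:
    print(f"garden {garden}: n_j = {n[garden]}, sample mean = {treatment_means[garden]}")
```

These sample means estimate the population means μ_1, μ_2, and μ_3 that appear in the hypotheses.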