ISYE6414
Summer 2009
Lecture 10
The ANOVA model
Dr. Kobi Abayomi
July 15, 2010
1 Introduction - The “null” linear model - The ANOVA model
The analysis of variance, or ANOVA, refers to a collection of procedures for the analysis of responses from experimental units. The simplest ANOVA problem is referred to as a single factor or one way ANOVA and involves analysis either of data sampled from two or more populations or of data in which two or more treatments have been used. As such, the ANOVA setup is a generalization of the two-sample t-test and a special case of the linear model.
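To make the link to the two-sample t-test concrete, here is a minimal sketch, assuming the equal-variance (pooled) form of the t statistic and the usual one way F statistic (mean square between over mean square within). The function names and the data are made up for illustration; they are not from the lecture. With exactly two groups, the squared t statistic and the F statistic coincide.

```python
from statistics import mean

def pooled_t(a, b):
    """Two-sample t statistic with pooled (equal) variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    sp2 = (ss_a + ss_b) / (na + nb - 2)  # pooled variance estimate
    return (ma - mb) / (sp2 * (1 / na + 1 / nb)) ** 0.5

def one_way_f(groups):
    """One way ANOVA F statistic: MS between / MS within."""
    all_obs = [x for g in groups for x in g]
    grand = mean(all_obs)
    k, n = len(groups), len(all_obs)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical data, invented for this illustration only
a = [12.0, 15.0, 11.0, 14.0]
b = [18.0, 17.0, 21.0, 16.0, 19.0]

t = pooled_t(a, b)
f = one_way_f([a, b])
print(t ** 2, f)  # the two values agree up to floating-point rounding
```

With three or more groups there is no single t statistic to square, which is where the F statistic takes over.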
The characteristic that differentiates the treatments or populations from one another is called the factor, and the different treatments are referred to as the levels of the factor. This is analogous to the use of categorical predictors in the linear model.
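The analogy with categorical predictors can be sketched as follows, assuming one 0/1 indicator column per level and no intercept (one of several possible codings). The level names "A", "B", "C" and the responses are hypothetical. Under this coding the least squares coefficients are simply the level means.

```python
# Hypothetical factor with three levels and a made-up response
levels = ["A", "B", "C"]
factor = ["A", "B", "C", "B", "A", "C"]
y = [10.0, 14.0, 9.0, 16.0, 12.0, 11.0]

# Design matrix: one 0/1 indicator column per level, no intercept
X = [[1.0 if f == lev else 0.0 for lev in levels] for f in factor]

# Pieces of the normal equations (X'X) b = X'y
p = len(levels)
XtX = [[sum(row[r] * row[c] for row in X) for c in range(p)] for r in range(p)]
Xty = [sum(row[r] * yi for row, yi in zip(X, y)) for r in range(p)]

# X'X is diagonal here (its diagonal holds the level counts),
# so each coefficient is the level total divided by the level count
beta = [Xty[r] / XtX[r][r] for r in range(p)]

# Direct computation of the level means, for comparison
level_means = [
    sum(yi for f, yi in zip(factor, y) if f == lev) / factor.count(lev)
    for lev in levels
]
print(beta)
print(level_means)
```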
The regression parameters are effects in ANOVA. In the linear models we have studied thus far the effects have been fixed. Random Effects models are used where the parameters are taken to be random variables.
We’ll begin with the notation that is peculiar to ANOVA, then illustrate the similarity
between ANOVA and the linear model.
2 Single Factor or One Way ANOVA
2.1 Setup and Notation
Briefly, say a manager wants to investigate whether sales differ across marketing strategies. There are three strategies for marketing: Convenience, Quality and Price. We are given 20 weeks of sales data: the number of items sold per week, stratified by marketing strategy. As always we introduce some notation. Let:
$j$ ≡ An index for the treatments or populations being compared.
$K$ ≡ The total number of treatments. Here $K = 3$.
$i$ ≡ An index for the observations.
$n_j$ ≡ The number of observations in treatment $j$.
$\mu_j$ ≡ The mean of population or treatment $j$. Here $j = 1$ is the convenience strategy, $j = 2$ is the quality strategy, and $j = 3$ is the price strategy. Of course $\bar{x}_j$ is the sample mean of the $j$th treatment or strategy.
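The notation above can be mirrored directly in code. This is only a sketch: the sales figures below are invented (four weeks per strategy, not the lecture's data), and the variable names `sales`, `n` and `xbar` are our own choices.

```python
from statistics import mean

# Hypothetical weekly sales (items sold); NOT the lecture's data
sales = {
    1: [23, 25, 21, 27],  # j = 1: Convenience
    2: [30, 28, 33, 29],  # j = 2: Quality
    3: [19, 22, 20, 23],  # j = 3: Price
}

K = len(sales)                                      # number of treatments
n = {j: len(obs) for j, obs in sales.items()}       # n_j, observations per treatment
xbar = {j: mean(obs) for j, obs in sales.items()}   # sample mean of treatment j

for j in sorted(sales):
    print(f"j={j}: n_j={n[j]}, sample mean={xbar[j]}")
```

Keying the data by the treatment index keeps each sample mean next to the treatment it estimates.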
We seek to test for a difference in strategies. Our null hypothesis is, then, that there is no difference in strategies vs. an alternative that there is at least one difference between strategies. In notation:
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_a$: At least two means differ.
Now we need, of course, a test statistic, that is, a function of the observed values that we will link to some probability distribution. Let’s introduce some more notation:
$X_{i,j}$ = the random variable that denotes the $i$th observation on the $j$th treatment. What is $X_{2,2}$?
$x_{i,j}$ = the observed value of $X_{i,j}$ when the experiment is performed or the data are recorded.
The individual treatment means, that is, the means across treatments for each observation, are calculated as:
$$\bar{x}_{i,\cdot} = \frac{\sum_{j=1}^{K} x_{i,j}}{K}$$
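As a small numerical check of this formula, here is a sketch assuming a balanced layout (every observation index has a value under every treatment). The table `x` is hypothetical: rows are observations, columns are treatments.

```python
# Hypothetical balanced layout; NOT the lecture's data
# Rows: observations i = 1..4. Columns: treatments j = 1..3.
x = [
    [23, 30, 19],
    [25, 28, 22],
    [21, 33, 20],
    [27, 29, 23],
]

K = len(x[0])  # number of treatments

# Mean across treatments for each observation: sum over j, divided by K
row_means = [sum(row) / K for row in x]

for i, m in enumerate(row_means, start=1):
    print(f"i={i}: mean across treatments = {m:.3f}")
```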
The individual sample means, that is, the means within treatments, are calculated...