Statistical Methods I (EXST 7005) Page 116

Analysis of Variance (ANOVA)
R. A. Fisher – resolved a problem that had existed for some time. The hypothesis to be tested is H0: μ1 = μ2 = μ3 = ... = μk versus the alternative H1: some μ is different. Conceptually, we have separate
(and independent) samples, each giving a mean, and we
want to know if they could have all come from the
same population, or if it is more likely that at least one
came from a different population.
One way to do this is a series of t-tests.
If we want to test among 3 means we do 3 tests: 1
versus 2, 1 versus 3, 2 versus 3
For 4 means there are 6 tests. 1–2, 1–3, 1–4, 2–3, 2–4, and 3–4
For 5 means, 10 tests, etc.
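The number of pairwise tests is just "k choose 2" = k(k − 1)/2; a quick check (a Python sketch, standard library only):

```python
from math import comb  # binomial coefficient, available in Python 3.8+

# Number of pairwise t-tests needed to compare k means
for k in (3, 4, 5):
    print(k, "means ->", comb(k, 2), "tests")  # 3 -> 3, 4 -> 6, 5 -> 10
```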
This technique is unwieldy, and has other issues. When we do the first test, there is an α
chance of error, and for each additional test another α chance of error. So if you do 3 or 6
or 10 tests, the chance of error on each and every test is α.
Overall, for the experiment, the chance of error for all tests together is much higher than α.
Bonferroni gave a formula showing that the chance of error would be NO MORE than Σαi. So if we do 3 tests, each with a 5% chance of error, the overall probability of error is no greater than 15%; 30% for 6 tests; 50% for 10 tests; etc.
Of course this is an upper bound. Other calculations are probably more realistic, such as α′ = 1 − (1 − α)^(k−1) used by Duncan, or α′ = 1 − (1 − α)^(k/2) from the Student-Newman-Keuls calculation (where k is the number of groups to be tested, α is the error rate for each test, and α′ is the error rate for the collection of tests). The table below gives some probabilities of error calculated by Bonferroni's, Duncan's and Student-Newman-Keuls' formulas for tests done at α = 0.05.
Duncan [1 − (1 − α)^(k−1)]: 0.9190;  Student-Newman-Keuls [1 − (1 − α)^(k/2)]: 0.7226 (the surviving table entries; these values correspond to k = 50).

The bottom line: splitting an experiment into a number of smaller tests is generally a poor idea.
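These error-rate formulas can be checked numerically (a sketch; the Bonferroni bound is a function of the number of tests m, while Duncan's and Student-Newman-Keuls' formulas are written in terms of the number of groups k, as in the text):

```python
# Overall (experimentwise) error rates for tests each run at alpha = 0.05.

def bonferroni_bound(m, alpha=0.05):
    # Upper bound: the sum of the per-test alphas
    return m * alpha

def duncan(k, alpha=0.05):
    # alpha' = 1 - (1 - alpha)^(k - 1)
    return 1 - (1 - alpha) ** (k - 1)

def snk(k, alpha=0.05):
    # Student-Newman-Keuls: alpha' = 1 - (1 - alpha)^(k / 2)
    return 1 - (1 - alpha) ** (k / 2)

print(round(bonferroni_bound(3), 2))  # 0.15 -- three tests at 5% each
print(round(duncan(50), 4))           # 0.919
print(round(snk(50), 4))              # 0.7226
```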
This applies at higher levels as well (i.e. splitting big ANOVAs into little ones). The
solution: We need ONE test that will give us an accurate test with an α value of the desired
level.

James P. Geaghan, Copyright 2010

The concept
We are familiar with the variance: S² = Σ(Yi − Ȳ)² / (n − 1).

We are familiar with the pooled variance:

Sp² = (SS1 + SS2) / ((n1 − 1) + (n2 − 1)) = (γ1S1² + γ2S2²) / (γ1 + γ2)

where γi = ni − 1 is the degrees of freedom for sample i.

We are familiar with the variance of the means. But we never get “multiple” estimates of the
mean and calculate a variance from those. The calculation we use to get the variance of the means comes from statistical theory: SȲ² = S²/n. Could we actually get multiple estimates of the means, calculate a sum of squared deviations of the various means from an overall mean, and get the variance of the means from that?

Yes, we could, and the formula

SȲ² = Σ(Ȳi. − Ȳ..)² / (k − 1)   (summing over the k means)

should give the same value. Suppose we have some values from a number of different samples, perhaps taken at different sites.
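As a minimal numeric sketch (hypothetical site data; Python, standard library only), the variance of the means really can be computed directly from several sample means:

```python
from statistics import mean, variance  # variance() is the sample variance (n - 1 denominator)

# Hypothetical data: k = 3 sites, n = 4 observations per site
sites = [
    [10.0, 12.0, 14.0, 16.0],
    [12.0, 14.0, 16.0, 18.0],
    [20.0, 22.0, 24.0, 26.0],
]

site_means = [mean(s) for s in sites]  # 13.0, 15.0, 23.0
grand_mean = mean(site_means)          # 17.0 (equals the overall mean because the data are balanced)

# Sum of squared deviations of the site means from the overall mean, over k - 1
k = len(site_means)
var_of_means = sum((m - grand_mean) ** 2 for m in site_means) / (k - 1)

print(var_of_means)  # 28.0 -- identical to variance(site_means)
```

The direct sum-of-squares calculation agrees exactly with applying the ordinary sample-variance formula to the list of means.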
The values would be Yij, where the sites are i=1, 2, ..., k, and the observations from within the
sites are j = 1, 2, 3, ..., ni. For each site we calculate a value of the mean. We then take the
various means (k different means) and calculate a variance among those. This would also
give the “variance of the means”.

The LOGIC
Remember, we want to test
H0: μ1 = μ2 = μ3 = ... = μk
We have a bunch of means and we want to know if they were drawn from the same population or
different populations. We also have a bunch of samples each with its own variance (S2). If we
can assume homogeneous variance (all variances equal) then we could POOL the multiple
estimates of variance. So, to start with we will take the variances from each of the groups and
pool them into one new & improved estimate of variance. This will be the very best estimate
of variance that we will get (if the assumption is met).
Sp² = (SS1 + SS2 + SS3 + SS4 + SS5) / ((n1 − 1) + (n2 − 1) + (n3 − 1) + (n4 − 1) + (n5 − 1))

Now, think about the means. If the NULL HYPOTHESIS IS TRUE, then we could calculate the
variance of the means from the multiple
means. This would estimate SȲ², the variance of the means. We would take the deviations of each Ȳi. from the overall mean, Ȳ.., and get a variance from that.

[Figure: deviations of the group means (A, B, C, D, E) from the overall mean]

If the null hypothesis is true, the means should be pretty close to the overall mean. They won't be
exactly equal to the overall mean because of random sampling variation in the individual observations.

[Figure: under H0, the group means (A, B, C, D, E) all lie close to the overall mean]

However, if the null hypothesis is false, then some mean will be different! At least one, maybe more.

[Figure: under H1, at least one group mean lies far from the overall mean]

So we take the Sum of squared deviations, divide by the degrees of freedom, and we get an
estimate of the variance of the means:

SȲ² = Σ(Ȳi. − Ȳ..)² / (k − 1)

But this does not exactly estimate the variance; it estimates the variance of the means, that is, the variance divided by the sample size! The sample size is the number of observations in each mean:

SȲ² = Σ(Ȳi. − Ȳ..)² / (k − 1) = S²/n

In order to estimate the variance we must multiply this estimate by n, the sample size: nSȲ² = n(S²/n) = S², giving a second estimate of the variance. This is obviously easier if each
n sample size is the same (i. e. the experiment is balanced). We will usually use the
calculations for a balanced design, but the analysis can readily be done if the data is not
balanced. It's just a little more complicated.

The Solution
So what have we got?
One variance estimate that is pooled across all of the samples because the variances are equal
(an assumption, sometimes testable). This is the best estimate of random error.
And another variance that should be the same IF the null hypothesis is TRUE.
The second variance (from the means) may not be the same if the null hypothesis is false, depending on how great the departure from the null hypothesis is. Not only will the second variance not be the same, IT WILL BE LARGER! Why? Because when we are
testing means for equality we will not consider rejecting if the means are too similar, only
if they are too different; large differences in means yield large deviations, which
produce an overly large variance. So this will be a one tailed test.
And how do we go about testing these two variances for equality? Testing for equality of
variances requires an F-test, of course.
If H0: μ1 = μ2 = μ3 = ... = μk is true, then Sp² = nSȲ² (both estimate the same σ²).

If H1: some μi is different, then Sp² < nSȲ².

For a one-tailed F test we put the ONE WE EXPECT TO BE LARGER IN THE NUMERATOR:

F = nSȲ² / Sp²

And that is Analysis of Variance.
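The whole test can be sketched end to end on hypothetical, balanced data (Python, standard library only; this mirrors the Sp² and nSȲ² calculations described above):

```python
from statistics import mean, variance  # variance() uses the n - 1 denominator

# Hypothetical balanced data: k = 3 groups, n = 4 observations each
groups = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0, 9.0],
]
k = len(groups)
n = len(groups[0])

# Pooled within-group variance: Sp2 = (SS1 + ... + SSk) / ((n1 - 1) + ... + (nk - 1))
ss_within = sum(sum((y - mean(g)) ** 2 for y in g) for g in groups)
sp2 = ss_within / (k * (n - 1))

# n times the variance among the group means: the second estimate of sigma^2
group_means = [mean(g) for g in groups]
n_sy2 = n * variance(group_means)

# One-tailed F test: the variance expected to be larger goes in the numerator
F = n_sy2 / sp2
print(sp2, n_sy2, F)  # numerator df = k - 1 = 2, denominator df = k*(n - 1) = 9
```

With these numbers Sp² ≈ 1.67, nSȲ² = 28, and F ≈ 16.8; comparing F to the F(2, 9) distribution decides whether to reject H0.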
We are actually testing means, but we are doing it by turning them into variances; one pooled
variance from within the groups, called the “pooled within variance” and one variance
from between groups or among groups called the “variance among groups” or “between
group variance”. If the variances are not significantly different as judged by the F test,
then we cannot reject the null hypothesis. It is possible, as usual, that we make a Type II
error with some unknown probability (β). If the variances are judged to not be the same,
then the null hypothesis is probably not true. Of course we may have made a Type I error,
with a known probability of α.
Some of the calculations come later, but this is the basic idea.

R. A. Fisher
Ronald Aylmer Fisher is sometimes called the father of modern
statistics. Some of his major contributions include the development
of the basics of design of experiments and Analysis of Variance.
Born in London in 1890, he had very poor eyesight that prevented him
from studying by electric light. He had to learn by having things
read out to him. He developed the ability to view problems
geometrically and to work out mathematical equations in his head. In
1909 he won a scholarship to Cambridge.
He left an academic position teaching mathematics for a position
at Rothamsted Agricultural Experiment Station. In this
environment he developed many applied analyses for testing
experimental hypotheses (Analysis of Variance, circa 1918), and
provided much of the
foundation for modern
statistics. We will see other analyses (in
addition to ANOVA)
developed by Fisher. Some other contributions by Fisher include the first use of the term “null hypothesis”, development of the F distribution and of the Least Significant Difference, maximum likelihood estimation, and contributions to the early use of nonparametric statistics.

Terminology used in Analysis of Variance
Treatment – different experimental populations that are contained in an experiment and undergo
some application or manipulation by the experimenter
Control or check – a “treatment” that receives no experimental manipulation
Experimental Unit – the unit to which a treatment is applied
Sampling Unit – the unit that is sampled or measured
The linear model is given by Yij = μi + εij or Yij = μ + τi + εij,
where τi = (μi − μ.) is estimated by τ̂i = (Ȳi. − Ȳ..).

The calculation of the treatment Sum of Squares is a sum of the squared treatment effects:

SSTreatments = n Σ(Ȳi. − Ȳ..)²   (summing over the t treatments)

The calculation of the treatment Mean Square is the sum of squared effects divided by the degrees of freedom. A variance?

MSTreatments = n Σ(Ȳi. − Ȳ..)² / (t − 1) = n Στ̂i² / (t − 1)

A random treatment effect estimates a variance component. In order for treatments to be random,
they should be a random selection from a large (theoretically ∞) number of treatments.
Inferences developed from random treatments are for all the possible treatment levels.

Examples of random effects
The error terms in an experiment are always random. They represent random variation. This variation comes from the experimental unit and sometimes the sampling unit.

Compare production of rice varieties, where the rice varieties represent a random sample from the
world's rice varieties.
Estimate the alcohol content of beer, where the beers tested are randomly sampled from all the
beers in the population of interest (world, national).
Oxygen levels in bayous, where randomly selected bayous represent all bayous in the state.
A treatment is FIXED if all possible levels, or all levels of interest, are included in the experiment.
The treatment levels are selected by the investigator and are probably not chosen from a very
large number of possible values.
A fixed treatment estimates the sum of squared fixed effects for the treatments being investigated.
This is NOT a variance, but the calculation is the same: Στi² / (t − 1).

Examples of fixed effects
• Experiment includes all of the 7 rice varieties commonly grown in Louisiana
• Beers are limited to the 5 micro-breweries in Anchorage, Alaska.

There are some treatments that are common and typically fixed.
For example, indicator variables that include all possible levels of a treatment.
• Sex (male, female)
• Class (Freshman, Sophomore, Junior, Senior)
• Ordinal treatments, where data are categorized as large, medium & small, or as deep & shallow, or as a high level & a low level
• Before and After
• Treatments with a control group
If the treatments are fixed, then inferences are limited to the treatments included in the experiment.
Comparing the individual treatment levels is often of interest, since they are often specifically
chosen by the investigator. If the treatments are random, inferences are made to the whole
population that the sample was taken from. Individual “treatment levels” are usually not of
interest. Some treatments can be either fixed or random.
Years – do they represent random variation, or do we categorize them as “wet & dry”?
Months – randomly selected, or do they represent seasons?
Sites or locations – randomly chosen, or selected for certain characteristics?

Example
A new insulin preparation is being compared to an older standard and a saline control. Ten
rabbits are administered the preparations and blood sugar is measured after 20 minutes.
What are the treatments? Experimental units? Sampling units? Are the treatments fixed or
random?

CRD – the Completely Randomized Design
The analysis of variance we have seen is called the Completely Randomized Design (CRD)
because the treatments are assigned to the experimental units completely at random. The
analysis is also called a “one-way analysis of variance”. Later we will discuss the
Randomized Block Design (RBD).
This note was uploaded on 12/29/2011 for the course EXST 7005 taught by Professor Geaghan during the Fall '08 term at LSU.