Statistical Methods I (EXST 7005) Page 121

Oxygen levels in bayous, where randomly selected bayous represent all bayous in the state.
A treatment is FIXED if all possible levels, or all levels of interest, are included in the experiment.
The treatment levels are selected by the investigator and are probably not chosen from a very
large number of possible values.
A fixed treatment estimates the sum of squared fixed effects for the treatments being investigated.
This is NOT a variance, but the calculation is the same: ∑τᵢ²/(t − 1), with the sum taken over the t treatment levels.

Examples of fixed effects:
• Experiment includes all of the 7 rice varieties commonly grown in Louisiana
• Beers are limited to the 5 microbreweries in Anchorage, Alaska.

There are some treatments that are common and typically fixed.
For example, indicator variables that include all possible levels of a treatment.
• Sex (male, female)
• Class (Freshman, Sophomore, Junior, Senior)

Ordinal treatments, where data are categorized as large, medium & small, or as deep & shallow, or as a high level & a low level.
Before and After
Treatments with a control group
If the treatments are fixed, then inferences are limited to the treatments included in the experiment.
Comparing the individual treatment levels is often of interest, since they are often specifically
chosen by the investigator. If the treatments are random, inferences are made to the whole
population from which the sample was taken. Individual “treatment levels” are usually not of
interest. Some treatments can be either fixed or random.
Years – do they represent random variation, or do we categorize them as “wet & dry”?
Months – randomly selected, or do they represent seasons?
Sites or locations – randomly chosen, or selected for certain characteristics?

Example
A new insulin preparation is being compared to an older standard and a saline control. Ten
rabbits are administered the preparations and blood sugar is measured after 20 minutes.
What are the treatments? Experimental units? Sampling units? Are the treatments fixed or
random?

CRD – the Completely Randomized Design
The analysis of variance we have seen is called the Completely Randomized Design (CRD)
because the treatments are assigned to the experimental units completely at random. The
analysis is also called a “one-way analysis of variance”. Later we will discuss the
Randomized Block Design (RBD).
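The one-way (CRD) analysis of variance described above can be sketched from scratch. This is only an illustration, not the notes' SAS analysis: the treatment effects, error standard deviation, and sample sizes below are invented for the example.

```python
# Minimal sketch of a one-way (CRD) analysis of variance, computed from
# scratch with NumPy on hypothetical data: t = 5 treatments, n = 5
# observations per treatment, invented treatment effects.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
t, n = 5, 5                                      # treatments, reps per treatment
effects = np.array([0.0, 1.0, 2.0, 3.0, 4.0])    # hypothetical treatment effects
y = effects[:, None] + rng.normal(0.0, 2.0, size=(t, n))   # y[i, j]

grand_mean = y.mean()
ss_trt = n * ((y.mean(axis=1) - grand_mean) ** 2).sum()     # SSTreatments
ss_err = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum()   # SSE, pooled within treatments
df_trt, df_err = t - 1, t * (n - 1)
ms_trt, ms_err = ss_trt / df_trt, ss_err / df_err           # mean squares
F = ms_trt / ms_err                                         # F = MSTrt / MSE
p = stats.f.sf(F, df_trt, df_err)                           # upper-tail p-value
```

The same F and p come out of `scipy.stats.f_oneway(*y)`, which is a convenient cross-check on the hand calculation.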
James P. Geaghan Copyright 2010

Key aspects of the analysis.
Everything is important, but there are some aspects that I consider more important. These are
discussed below.
The 7 steps of hypothesis testing: Understand particularly the hypothesis being tested and the
assumptions we need to make to conduct a valid ANOVA.
Understand the tests of the assumptions, particularly the HOV tests and the evaluation of
normality, particularly the Shapiro-Wilk test.
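The HOV and normality checks mentioned above can be illustrated in Python using scipy's Levene and Shapiro-Wilk tests. The notes do these tests in SAS; this sketch uses invented data and is only meant to show the same ideas.

```python
# Sketch: testing ANOVA assumptions on hypothetical treatment groups —
# Levene's test for homogeneity of variance (HOV) and Shapiro-Wilk for
# normality of the pooled residuals.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Three invented treatment groups with equal spread, different means
groups = [rng.normal(loc=m, scale=2.0, size=10) for m in (10.0, 12.0, 15.0)]

# HOV: H0 says all group variances are equal
lev_stat, lev_p = stats.levene(*groups)

# Normality: test the residuals (deviations from each group mean), pooled
# across treatments, rather than the raw Yij values
residuals = np.concatenate([g - g.mean() for g in groups])
sw_stat, sw_p = stats.shapiro(residuals)
```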
Calculations: We will primarily do the ANOVA using SAS. However, it is important to
understand that the calculations, as originally derived by Fisher, were based on the
marginal totals or means, averaging or summing over all observations in the treatment.
This will take on additional significance when we talk about two-way ANOVA.
The ANOVA table: Understand the table usually used to express the results of an Analysis of
Variance. This same table will also be used for regression.
Traditional ANOVA table

Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              4          838.5976       209.6494      15.38    0.0001
Error             20          272.6680        13.6334
Corrected Total   24         1111.2656

SEE SAS OUTPUT

Expected Mean Square
What do we estimate when we calculate a pooled variance estimate (MSE) or the sum of
squared treatment (SSTreatments) effects divided by its d.f.?
The MSE estimates σ², the random variation among individuals in the population.

If the null hypothesis is true, the MS for Treatments also estimates the same random variation,
σ². The F test should then reject the null hypothesis only α·100% of the time.

But what if the null hypothesis is NOT true? Then the MSTreatments estimates σ², PLUS
some additional component due to a treatment effect.
For a random effect this additional component would be called στ². This is a variance. For a
FIXED effect the additional component is simply the sum of squared effects divided by
the d.f., ∑τᵢ²/(t − 1). This is not a variance component.

The ANOVA source table with its d.f. and expected mean squares (for a balanced design).
Note: 1-tailed test; n influences power.
Source       d.f.       EMS (Random)     EMS (Fixed)
Treatment    t − 1      σε² + nστ²       σε² + n∑τᵢ²/(t − 1)
Error        t(n − 1)   σε²              σε²
Total        tn − 1

We could also express our null hypothesis in terms of EMS [ H0: στ² = 0 ], particularly for the
random effect, since the variance component for treatments may be a value of interest.

Since for a fixed effect the individual means are usually of interest, the null hypothesis is
usually expressed in terms of the means ( H0: μ1 = μ2 = μ3 = ... = μt ).

Descriptions of post-hoc tests
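The expected-mean-square argument can be checked by simulation. The sketch below uses assumed values (t = 5, n = 5, σ = 2, 2000 simulated experiments, none of which come from the notes): with a true null hypothesis, both mean squares hover near σ² and F exceeds its critical value about α·100% of the time.

```python
# Simulation sketch of the expected mean squares: under H0 (no treatment
# effect) both MSE and MSTrt estimate sigma^2, and the F test rejects
# about alpha*100% of the time. All values here are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
t, n, sigma, alpha, reps = 5, 5, 2.0, 0.05, 2000
f_crit = stats.f.ppf(1 - alpha, t - 1, t * (n - 1))

mse, mstrt = [], []
for _ in range(reps):
    y = rng.normal(0.0, sigma, size=(t, n))        # H0 true: no treatment effects
    ms_t = n * ((y.mean(axis=1) - y.mean()) ** 2).sum() / (t - 1)
    ms_e = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum() / (t * (n - 1))
    mstrt.append(ms_t)
    mse.append(ms_e)

# Both mean squares average near sigma^2 = 4, and F > f_crit ~5% of the time
reject_rate = np.mean(np.array(mstrt) / np.array(mse) > f_crit)
```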
Post-hoc or post-ANOVA tests! Once you have found out that some treatment(s) are “different”,
how do you determine which one(s) are different?

If we had done a t-test on an individual pair of treatments, the test would have been done as

    t = (Ȳ1 − Ȳ2) / √( Sp²(1/n1 + 1/n2) ) = (Ȳ1 − Ȳ2) / √( MSE(1/n1 + 1/n2) )

If the difference Ȳ1 − Ȳ2 was large enough, the t value would have been greater than t-critical
and we would conclude that there was a significant difference between the means. Since we
know the value of t-critical, we can figure out how large a difference is needed for
significance for any particular values of MSE, n1 and n2. We do this by replacing t with
t-critical and solving for Ȳ1 − Ȳ2:

    t-critical = (Ȳ1 − Ȳ2) / √( MSE(1/n1 + 1/n2) ), so

    t-critical · √( MSE(1/n1 + 1/n2) ) = Ȳ1 − Ȳ2, or

    Ȳ1 − Ȳ2 = t-critical · S(Ȳ1−Ȳ2)

This value is the exact width of an interval Ȳ1 − Ȳ2 which would give a t-test equal to
t-critical. Any larger values would be “significant” and any smaller values would not. This is
called the “Least Significant Difference”:

    LSD = t-critical · S(Ȳ1−Ȳ2)

This least significant difference calculation can be used either to do pairwise tests on observed
differences or to place a confidence interval on observed differences.
The LSD can be done in SAS in one of two ways. The MEANS statement produces a range
test (LINES option) or confidence intervals (CLDIFF option), while the LSMEANS
statement gives pairwise comparisons.
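The LSD arithmetic can be sketched with the MSE and error d.f. from the ANOVA table shown earlier. The value n = 5 observations per treatment is an assumption inferred from that table's degrees of freedom (25 observations, 5 treatment levels) and should be checked against the actual data.

```python
# Sketch: the LSD computed from the earlier ANOVA table
# (MSE = 13.6334, error d.f. = 20), assuming a balanced design with
# n = 5 observations in each of the two treatments being compared.
from math import sqrt
from scipy import stats

mse, df_error, n1, n2, alpha = 13.6334, 20, 5, 5, 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df_error)     # two-tailed critical t value
lsd = t_crit * sqrt(mse * (1 / n1 + 1 / n2))      # LSD = t-critical * S(Ybar1-Ybar2)
# Any observed |Ybar1 - Ybar2| larger than lsd is declared significant
```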
The LSD has an α probability of error on each and every test. The whole idea of ANOVA is to
give a probability of error that is α for the whole experiment, so much work in statistics
has been dedicated to this problem. Some of the most common and popular alternatives
are discussed below. Most of these are also discussed in your textbook.

The LSD is the LEAST conservative of those discussed, meaning it is the one most likely
to detect a difference, and it is also the one most likely to make a Type I error when it finds
a difference. However, since it is unlikely to miss a difference that is real, it is also the
most powerful. The probability distribution used to produce the LSD is the t distribution.

Bonferroni's adjustment. Bonferroni pointed out that in doing k tests, each at a probability of
Type I error equal to α, the overall experimentwise probability of Type I error will be NO
MORE than k*α, where k is the number of tests. Therefore, if we do 7 tests, each at
α = 0.05, the overall rate of error will be NO MORE than 7 × 0.05 = 0.35, or 35%. So, if we want to
do 7 tests and keep an error rate of 5% overall, we can do each individual test at a rate of
α/k = 0.05/7 = 0.007143. For the 7 tests we then have an overall rate of no more than
7 × 0.007143 = 0.05. The probability distribution used to produce the Bonferroni tests is the
t distribution.
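The Bonferroni arithmetic above is just a division, worth writing out once:

```python
# Sketch: Bonferroni's adjustment for k = 7 tests at an overall
# (experimentwise) Type I error rate of 0.05.
alpha, k = 0.05, 7
per_test = alpha / k       # run each individual test at this smaller level
bound = k * per_test       # upper bound on the overall error rate
```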
Duncan's multiple range test. This test is intended to give groupings of means that are not
significantly different among themselves. The error rate is for each group, and has
sometimes been called a family-wise error rate. This is done in a manner similar to
Bonferroni, except the error rate is calculated as [1 − (1 − α)^(r−1)] instead of
the sum of the α values, for comparing two means that are r steps apart (for adjacent means,
r = 2). Two means separated by 3 other means would have r = 5, and the error rate would be
[1 − (1 − α)^(r−1)] = [1 − (1 − 0.05)⁴] = 0.1855. The value of α needed to keep an error rate of
0.05 is the reverse of this calculation, [1 − (1 − 0.05)^(1/4)] = 0.0127.
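The two Duncan calculations above, forward and reverse, are:

```python
# Sketch: Duncan's group error rate for means r steps apart, and the
# reverse calculation giving the per-comparison level needed to hold 0.05.
alpha, r = 0.05, 5                        # two means separated by 3 others
group_rate = 1 - (1 - alpha) ** (r - 1)   # 1 - 0.95**4
needed = 1 - (1 - alpha) ** (1 / (r - 1)) # per-comparison alpha for rate 0.05
```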
Tukey's adjustment. The Tukey adjustment allows for all possible pairwise tests, which is
often what an investigator wants to do. Tukey developed his own tables (see Appendix
table A.7 in your book for “percentage points of the studentized range”). For “t”
treatments and a given error degrees of freedom the table will provide 5% and 1% error
rates that give an experimentwise rate of Type I error.
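The studentized-range critical value can be looked up in scipy rather than in Appendix table A.7. This sketch assumes scipy ≥ 1.7 (for `studentized_range`) and reuses the earlier table's MSE; n = 5 observations per treatment is an assumption, as before.

```python
# Sketch: Tukey's critical difference from the studentized range
# distribution, for t = 5 treatments, error d.f. = 20, MSE = 13.6334,
# and an assumed n = 5 observations per treatment.
from math import sqrt
from scipy.stats import studentized_range

t, df_error, mse, n, alpha = 5, 20, 13.6334, 5, 0.05
q_crit = studentized_range.ppf(1 - alpha, t, df_error)   # tabled q(0.05; 5, 20)
hsd = q_crit * sqrt(mse / n)    # minimum significant pairwise difference
```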
Scheffé's adjustment. This test is the most conservative. It allows the investigator to do not
only all pairwise tests, but all possible tests, and still maintain an experimentwise error
rate of α. “All possible” tests includes not only all pairwise tests, but comparisons of all
possible combinations of treatments with other combinations of treatments (see
CONTRASTS below). The calculation is based on a square root of the F distribution, and
can be used for range-type tests or confidence intervals. The test is more general than the
others mentioned; for the special case of pairwise comparisons, the statistic is
√((t − 1) · F(t − 1, t(n − 1))) for a balanced design with t treatments and n observations per
treatment.

Place the post-hoc tests above in order from the one most likely to detect a difference (and the
one most likely to be wrong) to the one least likely to detect a difference (and the one least
likely to be wrong). LSD is first, followed by Duncan's test, Tukey's and finally
Scheffé's. Dunnett's is a special test that is similar to Tukey's, but for a specific purpose,
so it does not fit well in the ranking. The Bonferroni approach produces an upper bound
on the error rate, so it is conservative for a given number of tests. It is a useful approach if
you want to do a few tests, fewer than allowed by one of the others (e.g. you may want to
do just a few and not all possible pairwise). In this case, the Bonferroni may be better.
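Scheffé's pairwise statistic from the formula above can be sketched numerically, again under the assumed values t = 5 and n = 5 (matching the earlier table's degrees of freedom):

```python
# Sketch: Scheffe's critical value for a pairwise comparison,
# sqrt((t - 1) * F(alpha; t - 1, t(n - 1))), for an assumed balanced
# design with t = 5 treatments and n = 5 observations each.
from math import sqrt
from scipy import stats

t, n, alpha = 5, 5, 0.05
df1, df2 = t - 1, t * (n - 1)
scheffe_crit = sqrt(df1 * stats.f.ppf(1 - alpha, df1, df2))
```

Because Scheffé protects against all possible contrasts, this critical value is noticeably larger than the plain t-critical value used by the LSD, which is the ranking described above in numerical form.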
Evaluating the assumptions for ANOVA.

We have already discussed some techniques for the evaluation of data for homogeneous
variance. The assumption of independence is somewhat more difficult to evaluate.
Random sampling is the best guarantee of independence and should be used as much as
possible.
The third assumption is normality. The observations are assumed to be normally
distributed within each treatment, but how the treatments come together to form the
dependent variable Yij may cause them to look non-normal. The best way to test for
normality is to examine the residuals, pooling the normal distribution across the
...