Statistical Techniques II
Treatment Arrangements
Sometimes the treatment simply consists of a list of levels that the investigator is interested in
We will term this type of treatment arrangement a single factor treatment arrangement or an “a
priori” treatment arrangement.
These are often fixed treatment levels that the investigator wants to examine, but they may be
There are several other possibilities.
Cross classified (factorial, two-way ANOVA)
Like treatments with blocks, two treatments can be cross-classified.
Nested treatment arrangement
A factorial arrangement of treatments occurs when we have two (or more) treatments of interest
arranged such that each level of the first occurs with each level of the second. All possible
combinations of the two treatments exist.
Examples - Examine the effect of three dietary supplements (a, b & c) on weight gain for males and
females. Each sex gets the same three diets (6 combinations)
Examine the effectiveness of three pre-emergence herbicides and four post-emergence
herbicides. All of the 12 combinations exist, each treatment may have a null treatment as a
The other type of treatment arrangement is the nested treatment arrangement. Nested treatment
arrangements occur when each level of some treatment occurs in combination with some other
treatment, but the levels of the second treatment are not the same for each level of the first
Examples Examine the effect of three dietary supplements on weight gain for males and females. Each
sex gets three diets, but the diets are different for males (a, b & c) and females (d, e & f).
Examine the effectiveness of four post-emergence herbicides on three different crops. The
approved post emergence herbicides are not the same for the three crops.
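The difference between the two arrangements can be checked mechanically: an arrangement is factorial when every level of one treatment occurs with every level of the other. The sketch below uses the diet examples from the text; the function name `is_factorial` is a hypothetical helper introduced only for illustration.

```python
from itertools import product

def is_factorial(combos):
    """Return True when every level of A occurs with every level of B."""
    a_levels = {a for a, _ in combos}
    b_levels = {b for _, b in combos}
    return set(combos) == set(product(a_levels, b_levels))

# Factorial: both sexes get the same three diets (6 combinations).
factorial_diets = [(sex, diet) for sex in ("male", "female") for diet in "abc"]

# Nested: males get diets a, b, c while females get d, e, f (also 6 combinations).
nested_diets = [("male", d) for d in "abc"] + [("female", d) for d in "def"]

print(len(factorial_diets), is_factorial(factorial_diets))
print(len(nested_diets), is_factorial(nested_diets))
```

Both layouts have six treatment combinations; only the first contains all possible sex by diet pairs.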
Factorial
        B1      B2      B3      B4
A1     a1b1    a1b2    a1b3    a1b4
A2     a2b1    a2b2    a2b3    a2b4
A3     a3b1    a3b2    a3b3    a3b4

Nested
A1     A1B1    A1B2    A1B3
A2     A2B4    A2B5    A2B6
A3     A3B7    A3B8    A3B9

James P. Geaghan - Copyright 2011, Statistical Techniques II

Nested treatment arrangements are not too common, but they can occur.
For example, suppose we wanted to test for differences in attendance at State Parks. We choose 4
parks in TX, 5 in LA and 3 in MS. There is no “match” for the parks between states. We could
measure attendance on randomly chosen dates and our model would be
MODEL Y = STATE PARK(STATE);
Another example. Suppose we wanted to test for the effectiveness of various commonly used
herbicides on major crops in LA by examining dollar value per acre. We choose crops (Cane,
Rice, Soy and Corn). We select representative fields at random and treat with an appropriate
Unfortunately, the same herbicides are not used on these crops. Corn and Cane are
grasses, and the herbicides target “broadleaf” plants. Soybean is a broadleaf plant, so it
requires different herbicides. Rice is grown in water and requires special herbicides.
So, each crop has its own suite of herbicides.
MODEL Y = CROP HERB(CROP);
Factorial designs are VERY common, popular and highly recommended.
This treatment arrangement also has some unique properties and interpretations (especially
We will concentrate on this treatment arrangement.
Treatment Interactions
The one really different thing about treatments is that we are interested in them (as opposed to
blocks and nested error terms).
We may want to test the individual levels. This will be our major topic following treatment
We are also likely to be interested in the INTERACTION!
Interactions
This is new and VERY important. Block & treatment interactions are “error”, and not of interest.
However, treatment interactions measure how consistent one treatment is across the levels of
another. This is interesting and important. It cannot be ignored.
Look at the table below. What value belongs in the missing cell?

                  Treatment 2
Treatment 1     a      b      c
     a          3      5      7
     b          6      8     10
     c          2    ????     6
     d          5      7      9

The missing value is 4!!! How did you know?

Could it be 16?

                  Treatment 2
Treatment 1     a      b      c
     a          3      5      7
     b          6      8     10
     c          2     16      6
     d          5      7      9

Could it be 1?

                  Treatment 2
Treatment 1     a      b      c
     a          3      5      7
     b          6      8     10
     c          2      1      6
     d          5      7      9

Of course it can be any value it wants to be. There are no restrictions. However, if it is any value
other than 4, then there is an interaction.
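The reasoning behind "the missing value is 4" can be sketched in a few lines. Under additivity, the difference between two columns is the same in every row, so the missing cell can be predicted from any complete reference row and column. The function name `additive_predictions` is a hypothetical helper, not something from the course materials.

```python
# Rows are Treatment 1 levels (a-d); columns are Treatment 2 levels (a-c).
table = {
    "a": {"a": 3, "b": 5, "c": 7},
    "b": {"a": 6, "b": 8, "c": 10},
    "c": {"a": 2, "b": None, "c": 6},  # the missing cell
    "d": {"a": 5, "b": 7, "c": 9},
}

def additive_predictions(table, row, col):
    """Predict cell (row, col) from every reference row r0 and column c0.

    With no interaction: cell(row, col) = cell(row, c0) + cell(r0, col) - cell(r0, c0).
    """
    predictions = []
    for r0 in table:
        for c0 in table[r0]:
            if r0 == row or c0 == col:
                continue  # a reference must not touch the missing row or column
            predictions.append(table[row][c0] + table[r0][col] - table[r0][c0])
    return predictions

predictions = additive_predictions(table, "c", "b")
print(predictions)
```

Every reference row and column gives the same prediction, which is only possible because the observed cells are exactly additive.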
Cell values with marginal values; note additivity.

                  Treatment 2
Treatment 1     a      b      c     Mean   Effect
     a          3      5      7      5      –1
     b          6      8     10      8       2
     c          2    4.00     6      4      –2
     d          5      7      9      7       1
    Mean        4      6      8      6
   Effect      –2      0      2

If we plot the data and there is no interaction, the lines connecting the means should be parallel.
[Figure: cell means by Treatment 1 level (T1a to T1d), one line for each Treatment 2 level (T2 a, T2 b, T2 c).]
If an interaction is present the lines are not parallel.
[Figure: cell means by Treatment 1 level (T1a to T1d), one line for each Treatment 2 level (T2 a, T2 b, T2 c).]
And may even cross.
[Figure: cell means by Treatment 1 level (T1a to T1d), one line for each Treatment 2 level (T2 a, T2 b, T2 c).]
So how do we interpret an interaction?
If there is no interaction the behavior of the treatments is consistent. The means increase and
decrease by the same amount.
If there is an interaction, increases and decreases in the means are unpredictable and cannot be
predicted from the main effects.
Of course, in practice no lines are ever EXACTLY parallel. The means never increase and
decrease by EXACTLY the same amount.
So we need a statistical test to determine if the departure is statistically meaningful; if the
interaction is “significant”.
No problem. We make the interaction a source in our model and test it.
But note one key factor. Blocks had interactions with treatments. We calculated those, and tested
them if we wanted.
However, interactions with blocks are usually not of interest, they are simply a measure of
Treatment interactions are of great interest, because if our treatments are not consistent we must
know how they change to make our conclusions.
Factorial treatment arrangement
We will be especially concerned with factorial treatment arrangements. These are very common.
R. A. Fisher pointed out that these designs had “hidden replication”.
For example, suppose we have a 4 by 5 factorial treatment arrangement with 2 replicate
observations in each of the 20 treatment combinations.
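The "hidden replication" in this 4 by 5 layout can be tallied with simple arithmetic. This is a minimal sketch; the variable names are illustrative only.

```python
t1_levels = 4   # levels of Treatment 1
t2_levels = 5   # levels of Treatment 2
n = 2           # replicate observations in each treatment combination

cells = t1_levels * t2_levels      # number of treatment combinations
obs_per_cell = n                   # replication of each cell mean
obs_per_t1_mean = t2_levels * n    # each Treatment 1 mean averages over all Treatment 2 levels
obs_per_t2_mean = t1_levels * n    # each Treatment 2 mean averages over all Treatment 1 levels
total_obs = cells * n

print(cells, obs_per_cell, obs_per_t1_mean, obs_per_t2_mean, total_obs)
```

Each cell has only 2 observations, but the main effect means are based on 10 and 8 observations.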
Number of replicates per treatment combination. Note that treatment mean comparisons have
more reps.

                  Treatment 1
Treatment 2     a     b     c     d    Sum
     a          2     2     2     2      8
     b          2     2     2     2      8
     c          2     2     2     2      8
     d          2     2     2     2      8
     e          2     2     2     2      8
    Sum        10    10    10    10     40

How important are interactions?
If we have significant main effects and significant interactions,
can we ignore one?
Let’s examine some graphs for an experiment. Suppose we are trying to determine the best of
three herbicides (1, 2, 3) to control weeds on five soil types (a, b, c, d and e).
No interaction. Herbicide 3 is best on every soil type.
[Figure: weed control by soil type (a to e), one line for each herbicide (1, 2, 3).]
Interaction present. Herbicide 3 is still best on every soil type.
[Figure: measure of weeds by soil type (a to e), one line for each herbicide (1, 2, 3).]
Sometimes the interaction is significant, but one main effect stands out anyway. Other times the
interaction is so strong that the best result for treatment 1 depends on the combination
with treatment 2.
Interaction. Which herbicide is best?
[Figure: weed control by soil type (a to e), one line for each herbicide (1, 2, 3).]
The bottom line. Unlike treatment and block interactions, treatment interactions are not
“assumed” away! Test them, and be prepared to examine them.
Environmental impact assessment
Let's look at another case of interaction. Suppose we are constructing a power plant, and plan to
dump cooling water into a river.
We want to determine if there is an impact on the growth of Channel catfish in the river. We
measure growth by sampling otoliths from small catfish.
We sample above the power plant and below the power plant to see if the growth is different.
“Upstream - downstream” should detect impact.
[Figure: river diagram showing the heat discharge and the sample sites.]
But then we are told that this will mean nothing. Growth downstream has always been
different from growth upstream. Better habitat, nutrition, etc.
So we try another tactic, we sample for a few years before the plant goes into operation and for a
few years after the plant goes into operation. Surely “before-after” will detect impact.
Not necessarily. Maybe the years before were wet “El Niño” years and the years after were dry
“La Niña” years. Or maybe something happens way upstream at the same time our power plant
is finished! Then any observed changes would not be due to our power plant.
So how do we sample impact? We must detect an interaction. In this case the ONLY term of real
interest for detecting impact is the interaction. The main effects are not useful in detecting
impact!!

           Upstream   Downstream
Before        24          32
After         27          25

Terminology.
Additivity - Take a cell in a factorial treatment arrangement with an overall mean of 10.
If the EFFECT for treatment 1 is “5” and the effect for treatment 2 is “–3”, the value in the cell
should be overall mean + effect 1 + effect 2 = 10 + 5 – 3 = 12.
We get the cell by ADDing the effects. This is additivity.
This will not work if there is an interaction.
Interactions are sometimes referred to as tests of additivity.
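Additivity can be checked numerically: subtract the row mean and the column mean from each cell mean and add back the grand mean. Under additivity every result is zero. The sketch below applies this to the additive 4 by 3 table used earlier and to the before/after by upstream/downstream table. The function name `interaction_residuals` is a hypothetical helper.

```python
def interaction_residuals(cells):
    """Cell mean - row mean - column mean + grand mean, for every cell."""
    n_rows, n_cols = len(cells), len(cells[0])
    row_means = [sum(row) / n_cols for row in cells]
    col_means = [sum(row[j] for row in cells) / n_rows for j in range(n_cols)]
    grand = sum(row_means) / n_rows
    return [[cells[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)] for i in range(n_rows)]

# Additive table (rows: Treatment 1 a-d, columns: Treatment 2 a-c).
additive = [[3, 5, 7], [6, 8, 10], [2, 4, 6], [5, 7, 9]]

# Impact table (rows: Before, After; columns: Upstream, Downstream).
impact = [[24, 32], [27, 25]]

additive_res = interaction_residuals(additive)
impact_res = interaction_residuals(impact)

# Interaction contrast: change downstream minus change upstream.
impact_contrast = (impact[1][1] - impact[0][1]) - (impact[1][0] - impact[0][0])

print(additive_res)
print(impact_res, impact_contrast)
```

The additive table gives residuals of zero everywhere, while the impact table does not: growth rose by 3 upstream but fell by 7 downstream. Whether that departure is statistically significant still requires the test of the interaction.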
For the model $Y_{ijk} = \mu + \tau_{1i} + \tau_{2j} + (\tau_1\tau_2)_{ij} + \varepsilon_{ijk}$
There is no interaction if $\bar{Y}_{ij.} - \bar{Y}_{i..} - \bar{Y}_{.j.} + \bar{Y}_{...} = 0$
Note that this calculation was done on means ($\bar{Y}$), not effects ($\tau$).
For cell T1c, T2b: 4–6–4+6 = 0, indicating no interaction

                  Treatment 2
Treatment 1     a      b      c     Mean   Effect
     a          3      5      7      5      –1
     b          6      8     10      8       2
     c          2    4.00     6      4      –2
     d          5      7      9      7       1
    Mean        4      6      8      6
   Effect      –2      0      2      0

Multiplicative models (Chi square analysis and log-linear models).
Drug A saves 50 percent of fish with a certain fungus.
Drug B saves 50 percent of fish with the same fungus.
Giving Drug A and Drug B together should save what percent?
100%, 75%, 50%, 25%, 0%?
For an additive model, the answer is 100%. If we have 100 fish and Drug A and Drug
B both save 50 of 100, then all fish will be saved.
In a proportional or multiplicative model, Drug A saves 50%, adding Drug B will save
50% of the remaining fish for a total of 75%.
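The two answers can be reproduced with a couple of lines of arithmetic. This is only an illustration of the drug example above; the variable names are arbitrary.

```python
p_a = 0.5  # proportion of fish saved by Drug A alone
p_b = 0.5  # proportion of fish saved by Drug B alone

# Additive model: the two effects simply add (capped at 100%).
additive_saved = min(1.0, p_a + p_b)

# Multiplicative model: Drug B saves its proportion of the fish Drug A did not save.
multiplicative_saved = 1 - (1 - p_a) * (1 - p_b)

print(additive_saved, multiplicative_saved)
```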
We will not be working with these models, but you should be aware of them.
Chi square tests of independence test for proportional interactions, not additive interactions.
Log-linear models (which we saw for regression) can be applied to ANOVA (by taking the log
of Yi), and test for multiplicative effects.
EMS for Treatments
Expected mean squares for treatments, nested or cross-classified, work exactly the same as for
nested error terms or cross-classified blocks.
The only difference is that treatments may well be both fixed, while blocks are random. This
will be the only real new consideration.
The source table for this CRD, with sources, degrees of freedom and EMS is given below. The
treatments are single factor, either fixed or random.
Source   d.f.      SS        MS       EMS
Tmt      t–1       SSTmt     MSTmt    $\sigma^2 + n\sigma^2_{\tau}$
Error    t(n–1)    SSE       MSE      $\sigma^2$
Total    tn–1      SSTotal

CRD with fixed effect treatments.

Source   d.f.      SS        MS       EMS
Tmt      t–1       SSTmt     MSTmt    $\sigma^2 + n\sum\tau_i^2/(t-1)$
Error    t(n–1)    SSE       MSE      $\sigma^2$
Total    tn–1      SSTotal

The design below has 4 nested levels. The top line is a treatment, the bottom an error. The two
others could be either.
Nested treatments are not common. Note that a fixed effect can be represented as Q
Source          d.f.        EMS
Tmt             t–1         $\sigma^2 + n\sigma^2_{C} + ns\sigma^2_{B} + Q$
B(Tmt)          t(p–1)      $\sigma^2 + n\sigma^2_{C} + ns\sigma^2_{B}$
C(B*Tmt)        tp(s–1)     $\sigma^2 + n\sigma^2_{C}$
Rep(C*B*Tmt)    tps(n–1)    $\sigma^2$
Total           tpsn–1

The source table for an RBD.

Source   d.f.          SS        MS       EMS
Tmt      t–1           SSTmt     MSTmt    $\sigma^2 + b\sigma^2_{\tau}$
Block    b–1           SSBlk     MSBlk    $\sigma^2 + t\sigma^2_{\beta}$
Error    (t–1)(b–1)    SSE       MSE      $\sigma^2$
Total    tb–1          SSTotal

The source table for a Factorial. Do not ever do the experiment below!!! There is no test of
the interaction because there is no error term!
Source       d.f.            SS              EMS
Tmt 1        t1–1            SSTmt1          $\sigma^2_{\tau_1\tau_2} + t_2\sigma^2_{\tau_1}$
Tmt 2        t2–1            SSTmt2          $\sigma^2_{\tau_1\tau_2} + t_1\sigma^2_{\tau_2}$
Tmt1*Tmt2    (t1–1)(t2–1)    SSInteraction   $\sigma^2_{\tau_1\tau_2}$
Total        t1t2–1          SSTotal

We can do experiments with one block and one treatment, because the interaction is an error
term. We cannot do experiments with just two treatments. We need replicate experimental
units within treatments to test for interactions.
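The need for replication shows up directly in the degrees of freedom. The sketch below tallies the d.f. for a two-factor factorial; the function name `factorial_df` and the example sizes (3 by 4) are assumptions chosen only for illustration.

```python
def factorial_df(t1, t2, n):
    """Degrees of freedom for a t1 x t2 factorial with n replicates per cell."""
    return {
        "Tmt1": t1 - 1,
        "Tmt2": t2 - 1,
        "Tmt1*Tmt2": (t1 - 1) * (t2 - 1),
        "Error": t1 * t2 * (n - 1),
        "Total": t1 * t2 * n - 1,
    }

unreplicated = factorial_df(3, 4, 1)  # one observation per treatment combination
replicated = factorial_df(3, 4, 2)    # two observations per treatment combination

print(unreplicated)
print(replicated)
```

With n = 1 the error line has zero degrees of freedom, so nothing is left to test the interaction against.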
The previous “bad” model would be $Y_{ij} = \mu + \tau_{1i} + \tau_{2j} + (\tau_1\tau_2)_{ij}$
The “good” model would be $Y_{ijk} = \mu + \tau_{1i} + \tau_{2j} + (\tau_1\tau_2)_{ij} + \varepsilon_{ijk}$

Source     d.f.            SS         EMS
Tmt 1      t1–1            SSTmt1     $\sigma^2 + n\sigma^2_{\tau_1\tau_2} + nt_2\sigma^2_{\tau_1}$
Tmt 2      t2–1            SSTmt2     $\sigma^2 + n\sigma^2_{\tau_1\tau_2} + nt_1\sigma^2_{\tau_2}$
T1 * T2    (t1–1)(t2–1)    SST1T2     $\sigma^2 + n\sigma^2_{\tau_1\tau_2}$
Error      t1t2(n–1)       SSE        $\sigma^2$
Total      t1t2n–1         SSTotal

The EMS for treatments work the same as for blocks and treatments. There is however one
really big consideration remaining.
Blocks are random, treatments are either random or fixed, and this will affect our tests.
Note that in the preceding table the treatment interaction was actually the error term for
the treatment main effects.
This is true, and it is not a problem. SAS PROC MIXED (and PROC GLM with appropriate
options) will do the correct tests as long as you specify that the treatments are random.
If one treatment is fixed and one is random, nothing changes for this example, since the
interaction of a random effect and a fixed effect is still random.
The test of the main effects is still done with the interaction.
Source     d.f.            SS         EMS
Tmt 1      t1–1            SSTmt1     $\sigma^2 + n\sigma^2_{\tau_1\tau_2} + Q_{\tau_1}$
Tmt 2      t2–1            SSTmt2     $\sigma^2 + n\sigma^2_{\tau_1\tau_2} + nt_1\sigma^2_{\tau_2}$
T1*T2      (t1–1)(t2–1)    SST1T2     $\sigma^2 + n\sigma^2_{\tau_1\tau_2}$
Error      t1t2(n–1)       SSE        $\sigma^2$
Total      t1t2n–1         SSTotal

HOWEVER, if BOTH effects are fixed, then the interaction is also FIXED. Fixed effects occur
only on their own source line, not in any other sources!
This makes a BIG difference!!!
Note that when all treatments are fixed we do not have the interaction as part of the main
Source     d.f.            SS         EMS
Tmt 1      t1–1            SSTmt1     $\sigma^2 + Q_{\tau_1}$
Tmt 2      t2–1            SSTmt2     $\sigma^2 + Q_{\tau_2}$
T1 * T2    (t1–1)(t2–1)    SST1T2     $\sigma^2 + Q_{\tau_1\tau_2}$
Error      t1t2(n–1)       SSE        $\sigma^2$
Total      t1t2n–1         SSTotal

Now, what is the correct error term for the treatments and interactions?
Source     d.f.            SS         EMS
Tmt 1      t1–1            SSTmt1     $\sigma^2 + Q_{\tau_1}$
Tmt 2      t2–1            SSTmt2     $\sigma^2 + Q_{\tau_2}$
T1 * T2    (t1–1)(t2–1)    SST1T2     $\sigma^2 + Q_{\tau_1\tau_2}$
Error      t1t2(n–1)       SSE        $\sigma^2$
Total      t1t2n–1         SSTotal

Right, the experimental error term!
With a random effects model or mixed model, interactions are error terms. However with
all effects fixed, the experimental error is the error for both main effects and interactions.
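The rule for a replicated two-factor factorial can be written as a small lookup. This sketch follows the convention used in these notes (if either treatment is random the interaction is random and serves as the error term for the main effects); the function names are hypothetical and the rule is not a general EMS algorithm.

```python
def main_effect_denominator(tmt1_fixed, tmt2_fixed):
    """Source whose mean square is the F-test denominator for the main effects."""
    if tmt1_fixed and tmt2_fixed:
        return "Error"        # all fixed: experimental error tests everything
    return "Tmt1*Tmt2"        # random or mixed: the interaction is the error term

def interaction_denominator():
    """The interaction itself is tested against the experimental error."""
    return "Error"

for fixed1, fixed2 in [(False, False), (True, False), (True, True)]:
    print(fixed1, fixed2, main_effect_denominator(fixed1, fixed2), interaction_denominator())
```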
Note that this is what SAS does by default in PROC GLM, so these tests are available by
default in some models.
What if you have both experimental error and sampling error?
Now the experimental error must be used, and the “sampling error” is the residual error.
You may specify a TEST statement to make the tests, or rely on the PROC GLM random
statement with the /test option, or PROC MIXED.
Source      d.f.            EMS
Tmt 1       t1–1            $\sigma^2 + n\sigma^2_{E} + Q_{\tau_1}$
Tmt 2       t2–1            $\sigma^2 + n\sigma^2_{E} + Q_{\tau_2}$
T1*T2       (t1–1)(t2–1)    $\sigma^2 + n\sigma^2_{E} + Q_{\tau_1\tau_2}$
E. Error    t1t2(s–1)       $\sigma^2 + n\sigma^2_{E}$
S. Error    t1t2s(n–1)      $\sigma^2$
Total       t1t2sn–1

Missing cells
Factorial with missing cells (don't ever have missing cells, and if you do, don't use Type IV SS in
SAS unless you really know what you are doing)!

        B1      B2      B3      B4
A1     a1b1     .      a1b3    a1b4
A2     a2b1    a2b2    a2b3    a2b4
A3     a3b1    a3b2     .      a3b4

And if you are using SAS TYPE IV SS, you probably do not know what you are doing.
Missing cells are not an issue IF THERE IS KNOWN TO BE NO INTERACTION. SAS TYPE
III SS basically gives the result assuming no treatment interaction.
If there is an interaction, there is no proper test; the treatments cannot be separated.
If you have missing cells you can take all treatment combinations as a single treatment and do
selected contrasts. We will discuss contrasts soon.
I hate experiments with missing cells.
Tmt Arrangement Examples (see SAS output handout Appendix 18)
Summary
There are three types of treatment arrangement.
Single factor - very common and relatively simple
Factorial - the most common and important.
Nested - not so common, but can occur
A major new development with factorial treatment arrangements is the consideration of
We now have a more serious consideration of whether a treatment is FIXED or RANDOM.
Selecting the appropriate error term depends on this determination.
Missing cells are a serious no-no.
SAS Tmt Arrangement example (Appendix 18): from Snedecor & Cochran, 1980 (pg 305).
Dependent variable - Rat weight gain
treatments - factorial arrangement protein source (beef, pork, cereal) and protein level in the
diet (with & without).
Treatments most likely should be fixed, so TYPE III SS will give the correct test results.
This is the EMS for this experiment. The TYPE III SS tests are correct.
Source     d.f.            SS         EMS
Tmt 1      t1–1            SSTmt1     $\sigma^2 + nt_2\sum\tau_{1i}^2/(t_1-1)$
Tmt 2      t2–1            SSTmt2     $\sigma^2 + nt_1\sum\tau_{2j}^2/(t_2-1)$
T1 * T2    (t1–1)(t2–1)    SST1T2     $\sigma^2 + n\sum\sum(\tau_1\tau_2)_{ij}^2/[(t_1-1)(t_2-1)]$
Error      t1t2(n–1)       SSE        $\sigma^2$
Total      t1t2n–1         SSTotal

Test results

Source     d.f.    P>F       EMS
Tmt 1       1      0.0003    $\sigma^2 + nt_2\sum\tau_{1i}^2/(t_1-1)$
Tmt 2       2      0.5411    $\sigma^2 + nt_1\sum\tau_{2j}^2/(t_2-1)$
T1 * T2     2      0.0732    $\sigma^2 + n\sum\sum(\tau_1\tau_2)_{ij}^2/[(t_1-1)(t_2-1)]$
Error      54                $\sigma^2$
Total      59

The interactions were not quite significant, but were perhaps a little too large to ignore entirely
(P>F = 0.07).
From the plots (next 2 pages) it appears that growth is enhanced by high levels of beef & pork
protein sources while the high protein level with cereal does not enhance growth much, and is
about the same as the low levels of all 3 sources.

WEIGHT GAIN IN RATS ON VARIOUS DIETS
FACTORIAL DESIGN (2 BY 3) WITH REPLICATES
[Figure: plot with 2x standard errors to examine interaction; weight gain (70 to 110) by protein source (BEEF, CEREAL, PORK).]

WEIGHT GAIN IN RATS ON VARIOUS DIETS
FACTORIAL DESIGN (2 BY 3) WITH REPLICATES
BLOCK CHART TO EXAMINE INTERACTIONS
[Figure: block chart of mean weight gain by protein level and SOURCE; the means shown are listed below.]

Level    BEEF     CEREAL    PORK
HIGH     100      85.9      99.5
LOW      79.2     83.9      78.7

SAS 2x2x2 factorial example (Appendix 18): from Snedecor & Cochran, 1980 (pg 359).
Dependent variable - Hog weight gain
treatments - factorial arrangement of hog sex, high & low levels of protein and with or without a
lysine dietary supplement.
Treatments most likely should be fixed, so TYPE III SS will give the correct test results.
Source      d.f.                   EMS
Tmt 1       t1–1                   $\sigma^2 + nt_2t_3\sum\tau_{1i}^2/(t_1-1)$
Tmt 2       t2–1                   $\sigma^2 + nt_1t_3\sum\tau_{2j}^2/(t_2-1)$
Tmt 3       t3–1                   $\sigma^2 + nt_1t_2\sum\tau_{3k}^2/(t_3-1)$
T1*T2       (t1–1)(t2–1)           $\sigma^2 + nt_3\sum\sum(\tau_1\tau_2)_{ij}^2/[(t_1-1)(t_2-1)]$
T1*T3       (t1–1)(t3–1)           $\sigma^2 + nt_2\sum\sum(\tau_1\tau_3)_{ik}^2/[(t_1-1)(t_3-1)]$
T2*T3       (t2–1)(t3–1)           $\sigma^2 + nt_1\sum\sum(\tau_2\tau_3)_{jk}^2/[(t_2-1)(t_3-1)]$
T1*T2*T3    (t1–1)(t2–1)(t3–1)     $\sigma^2 + n\sum\sum\sum(\tau_1\tau_2\tau_3)_{ijk}^2/[(t_1-1)(t_2-1)(t_3-1)]$
Error       t1t2t3(n–1)            $\sigma^2$

Test results (Tmt 1 = Lysine, Tmt 2 = Protein, Tmt 3 = Sex; EMS as in the table above)

Source     d.f.    P>F
Lysine      1      0.7069
Protein     1      <.0001
L*P         1      0.0012
Sex         1      0.1143
L*S         1      0.1983
P*S         1      0.9533
L*P*S       1      0.4783
Error      56

Note that sex and its interactions are not significant. There is no need to interpret or further
consider sex differences for this experiment.
Lysine alone is not significant, but protein and the lysine by protein interaction ARE significant,
so lysine does have some effect.
Examine the plots to determine the nature of the interaction.
PIG WEIGHT GAIN WITH DIET SUPPLEMENTS
FACTORIAL DESIGN (2x2x2) WITH REPLICATES
BLOCK CHART TO EXAMINE INTERACTIONS
[Figure: block chart of weight gain by PROTEIN and LYSINE level. Character graphics from SAS.]
What would happen if the effects were random?
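Source     EMS
Lysine     $\sigma^2 + n\sigma^2_{LSP} + np\sigma^2_{LS} + ns\sigma^2_{LP} + nps\sigma^2_{L}$
Protein    $\sigma^2 + n\sigma^2_{LSP} + ns\sigma^2_{LP} + nl\sigma^2_{SP} + nls\sigma^2_{P}$
Sex        $\sigma^2 + n\sigma^2_{LSP} + np\sigma^2_{LS} + nl\sigma^2_{SP} + nlp\sigma^2_{S}$
L*S        $\sigma^2 + n\sigma^2_{LSP} + np\sigma^2_{LS}$
L*P        $\sigma^2 + n\sigma^2_{LSP} + ns\sigma^2_{LP}$
P*S        $\sigma^2 + n\sigma^2_{LSP} + nl\sigma^2_{SP}$
L*P*S      $\sigma^2 + n\sigma^2_{LSP}$
Error      $\sigma^2$

The residual error is used to test the third order interaction.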
The third order interaction is used to test the second order interactions.
Using SAS PROC GLM there is no proper error term for testing the main effects, though one
can be calculated with the “Random / test” statement output. PROC MIXED gives a correct
result.
Split-plot and Repeated Measure Designs
The Split-plot and Repeated Measures “Designs” combine elements of design (error structure) and
treatment arrangement concepts. These are designs with two levels, a “Main Plot”, with its own
treatment and error, and a “Sub-plot”, with its own treatment and error.
It is possible to have more than just a single factor treatment arrangement in both levels.
The (minimum of) two treatments (from the main and sub plots) are usually cross classified.
Either Main or Subplot may have nested error structure.
The simplest split plot would have the following model (CRD).
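$Y_{ijk} = \mu + \tau_{1i} + \gamma_{ij} + \tau_{2k} + (\tau_1\tau_2)_{ik} + \varepsilon_{ijk}$
(here $\gamma_{ij}$ denotes the main plot error, error(a), and $\varepsilon_{ijk}$ the subplot error, error(b))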
Example with CRD main plot.
[Figure: field layout of 15 main plots assigned at random to treatments A, B and C: A B A C B C B B A C B C A C A]
Each plot SPLIT for a new treatment.
[Figure: the same layout with each main plot split into two subplots, one receiving F and one receiving G.]
Split-plot design source table. The d.f. for error(b) is the usual t1*t2*(n–1) less the d.f. for
error(a), t1*(n–1), giving t1*(t2–1)(n–1).
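The degrees of freedom for this split plot can be checked with a short calculation. The sketch assumes the sizes implied by the example (t1 = 3 main plot treatments, t2 = 2 subplot treatments, n = 5 main plots per treatment); the function name `split_plot_crd_df` is hypothetical.

```python
def split_plot_crd_df(t1, t2, n):
    """Degrees of freedom for a split plot with a CRD main plot."""
    error_a = t1 * (n - 1)                 # main plots within Treatment 1
    error_b = t1 * t2 * (n - 1) - error_a  # the usual error d.f. less error(a)
    return {
        "Treatment 1": t1 - 1,
        "Error(a)": error_a,
        "Treatment 2": t2 - 1,
        "Tmt1*Tmt2": (t1 - 1) * (t2 - 1),
        "Error(b)": error_b,
        "Total": t1 * t2 * n - 1,
    }

df = split_plot_crd_df(t1=3, t2=2, n=5)
print(df)
```

The error(b) line computed by subtraction agrees with the closed form t1*(t2–1)(n–1), and the sources add up to the total.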
Source         d.f.
Treatment 1    t1–1 = 2
Error(a)       t1(n–1) = 12
Treatment 2    t2–1 = 1
Tmt1*Tmt2      (t1–1)(t2–1) = 2
Error(b)       t1*(t2–1)(n–1) = 12
Total          t1*t2*n–1 = 29

Split-plot design - examples of splits
We may split a plot to do a new treatment, e.g. an agricultural experiment with fertilizer
treatments in plots may have a herbicide applied to half of each plot and not to the other half.
A soil study of contaminants may measure levels of the chemical of interest at various levels in
a soil core (0-5 cm, 6-10 cm, 11-15 cm, etc), so the core is split.
A study of the growth of plants, e.g. Spartina in a marsh, may split the plant into above ground,
root and rhizome biomass.
Anytime a treatment occurs within an experimental unit, we have a split-plot. Examples include a
study of fish diets with a male and a female fish in each aquarium, or weight gain of hogs with
large and small hogs in each pen, etc.
More complex designs are possible. The main plot may be an RBD, or the main plot and/or sub
plot treatments may be factorial or nested.
It is possible to have plots that are split twice, or split and measured repeatedly.
These designs are complicated, difficult to analyze and difficult to interpret.
So why do you do them?
Split plot design with an RBD main plot.
[Figure: field layout in blocks, with main plots assigned to treatments A, B and C and each main plot split into four subplots receiving d, e, f and g in random order.]
This design has two blocks, three levels in the main plot treatment and four levels in the subplot
For the main plot the analysis is the same as any RBD. This one will have treatments, blocks,
treatment*block interaction and replicated experimental units in blocks.
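The degrees of freedom for a split plot with an RBD main plot can be tallied the same way. The sketch uses b = 2 blocks, t1 = 3 main plot levels and t2 = 4 subplot levels from the description above, and assumes n = 2 replicate main plots per block by treatment combination, which is the value implied by the numeric d.f. in the source table. The function name is hypothetical.

```python
def split_plot_rbd_df(b, t1, t2, n):
    """Degrees of freedom for a split plot with a replicated RBD main plot."""
    return {
        "Block": b - 1,
        "Treatment 1": t1 - 1,
        "Blk*Tmt1": (b - 1) * (t1 - 1),
        "Error(a)": b * t1 * (n - 1),
        "Treatment 2": t2 - 1,
        "Tmt1*Tmt2": (t1 - 1) * (t2 - 1),
        # Blk*Tmt2 and Blk*Tmt1*Tmt2 are pooled into one line.
        "Blk*Tmt2 + Blk*Tmt1*Tmt2": (b - 1) * (t2 - 1) + (b - 1) * (t1 - 1) * (t2 - 1),
        "Error(b)": b * t1 * (t2 - 1) * (n - 1),
        "Total": b * t1 * t2 * n - 1,
    }

df_rbd = split_plot_rbd_df(b=2, t1=3, t2=4, n=2)
print(df_rbd)
```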
$Y_{ijkl} = \mu + \beta_i + \tau_{1j} + (\beta\tau_1)_{ij} + \gamma_{ijk} + \tau_{2l} + (\tau_1\tau_2)_{jl} + (\beta\tau_1\tau_2)_{ijl} + \varepsilon_{ijkl}$

Source table RBD main plot in split-plot.

Source                                 d.f. calculation                      numeric d.f.
Block                                  b–1                                   1
Treatment 1                            t1–1                                  2
Blk*Tmt1                               (b–1)(t1–1)                           2
Error(a)                               b t1 (n–1)                            6
Treatment 2                            t2–1                                  3
Tmt1*Tmt2                              (t1–1)(t2–1)                          6
Blk*Tmt2 + Blk*Tmt1*Tmt2 (pooled)      (b–1)(t2–1) + (b–1)(t1–1)(t2–1)       3 + 6 = 9
Error(b)                               b*t1*(t2–1)(n–1)                      18
Total                                  b*t1*t2*n–1                           47

Are there advantages to a split plot design?
Obviously, if there are covariances, they should be taken into account.
Also, the subplot error is expected to be smaller and have more degrees of freedom. As a
result, subplot tests should be more powerful. This is an advantage if the tests of interest
(treatment and interactions) can be placed in the subplot.
Repeated measures
The repeated measures design is similar to a split-plot. We have a “main plot”, which can be any
of the designs we have discussed previously (CRD, RBD, LSD).
We then take repeated measurements over time within the plots. If these “repeated measures”
are independent, then this “time” factor is just cross-classified with the treatment.
If, however, the measurements are NOT independent, we have a repeated measures design.
Independence? Again? Yep.
What do I mean by independent? For example, if you are sampling sugar content of an ear of
corn from a plot, or the height of Spartina in a plot, you ask, “are they independent or not?”
If you measure a different ear of corn from a different plant each time, or measure a different
Spartina plant, they are probably independent.
However, if you measure a kernel from the same ear of corn, or the same Spartina plant each
time (repeatedly), they are likely NOT independent.
Some examples of split plot and repeated measures variables.
Pre-post tests on people, in fact almost any experiment where several levels of a treatment(s) are
measured on the same subject (= a person).
Soil samples or water samples at different depths (in the same site).
Epiphytes on Spartina counted below, at and above the tide line (on the same plant).
Studies on plants like sugar cane where we measure production in year1, year2 and year3 on
the same biological material.
Ditto for asparagus, artichokes, most tree species, etc.
In general, any time your experimental unit has another treatment applied within each
experimental unit, this is a split plot. If the experimental unit (or sampling unit) is measured
over time it is repeated measures.
Why is this independence important? What can we do about it?
Let's BRIEFLY revisit the X and X'X matrices.
The X matrix for designs consists of columns of 0 values and 1 values, arranged to distinguish

$$X = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

For a simple CRD with 4 treatment levels the X'X matrix may look like the following.

$$X'X = \begin{pmatrix} n_1 & 0 & 0 & 0 \\ 0 & n_2 & 0 & 0 \\ 0 & 0 & n_3 & 0 \\ 0 & 0 & 0 & n_4 \end{pmatrix}$$

For a simple CRD with 4 treatment levels the (X'X)-1 matrix would look like the following.

$$(X'X)^{-1} = \begin{pmatrix} 1/n_1 & 0 & 0 & 0 \\ 0 & 1/n_2 & 0 & 0 \\ 0 & 0 & 1/n_3 & 0 \\ 0 & 0 & 0 & 1/n_4 \end{pmatrix}$$

To get the variances and covariances we multiply by the MSE, as you know. This gives MSE/ni
on the main diagonal (= $s^2_{\bar{Y}}$), and zeros on the off diagonal.
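These matrices can be built directly to confirm the pattern. The sketch below uses plain Python lists, two observations per treatment as in the X matrix shown, and an MSE of 6.0 that is purely hypothetical. The inverse is taken element by element, which is valid only because X'X is diagonal here.

```python
reps = [2, 2, 2, 2]  # observations per treatment level (n1..n4)
t = len(reps)

# One row per observation, one 0/1 indicator column per treatment level.
X = []
for level, n_i in enumerate(reps):
    for _ in range(n_i):
        X.append([1 if col == level else 0 for col in range(t)])

# X'X: sums of products of the indicator columns.
XtX = [[sum(row[a] * row[b] for row in X) for b in range(t)] for a in range(t)]

# X'X is diagonal, so its inverse is the reciprocal of each diagonal element.
XtX_inv = [[1 / XtX[a][a] if a == b else 0.0 for b in range(t)] for a in range(t)]

mse = 6.0  # hypothetical mean square error, for illustration only
cov = [[mse * XtX_inv[a][b] for b in range(t)] for a in range(t)]

for row in cov:
    print(row)
```

The result has MSE/ni on the diagonal and zero everywhere else, which is the "no covariance between treatments" structure described in the text.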
All those zeros on the off diagonal mean that THERE IS NO COVARIANCE BETWEEN THE
TREATMENTS. This is well and good, we do not expect covariances between the independently
But for the split plot and repeated measures, we do actually expect some covariances!!
Maybe the covariance is simple, perhaps it is a constant. This would be the assumption for
split-plot designs, and we can use GLM for our tests (but not for subplot standard errors).
But much recent scientific investigation has found that often the structure is not simple.
Split-plot SAS example
The data come from a classic experiment to measure the effect of manure on the yield of
barley. Six blocks of three whole plots were used, together with three varieties of barley. Each
whole plot was divided into four subplots to cater for the four levels of manure: 0, 0.01, 0.02
and 0.04 tons per acre. The data form a completely randomized design.
There is no significant manure level by variety treatment interaction, so the lines below do not
significantly depart from parallel lines. Also, the varieties alone are not significant so the data
could be represented by a single line.
[Figure: joined means with standard error bars to examine interaction; yield by manure treatment.]
The manure level is quantitative and can be tested for linear, quadratic and cubic effects, but it
is not equally spaced. Orthogonal polynomial tests show a linear and a quadratic effect.
[Figure: yield (70 to 130) by manure treatment.]
Statistics quote: The Ten Commandments of Statistical Inference (1/14/05,
1. Thou shalt not hunt statistical inference with a shotgun.
2. Thou shalt not enter the valley of the methods of inference without an experimental design.
3. Thou shalt not make statistical inference in the absence of a model.
4. Thou shalt honor the assumptions of thy model.
5. Thou shalt not adulterate thy model to obtain significant results.
6. Thou shalt not covet thy colleagues' data.
7. Thou shalt not bear false witness against thy control group.
8. Thou shalt not worship the 0.05 significance level.
9. Thou shalt not apply large sample approximation in vain.
10. Thou shalt not infer causal relationships from statistical significance.
See the covariance structure table below. A couple structures of particular interest are the variance
component structure (split plot) and a favorite repeated measure structure AR(1).
Some of the covariance structures available in SAS proc mixed. From SAS Institute Inc., SAS/STAT
software changes and enhancements through release 6.11. Cary, NC, 1996.
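Type: Simple structure (no structure, no repeated or split plot)
ijth element: $\sigma^2$ for i=j, 0 otherwise
Structure: $\sigma^2 [I] = \begin{pmatrix} \sigma^2 & 0 & 0 & 0 \\ 0 & \sigma^2 & 0 & 0 \\ 0 & 0 & \sigma^2 & 0 \\ 0 & 0 & 0 & \sigma^2 \end{pmatrix}$

Type: Variance components    Option: VC
ijth element: $\sigma^2_i$ for i=j, 0 otherwise
Structure: $\begin{pmatrix} \sigma^2_1 & 0 & 0 & 0 \\ 0 & \sigma^2_2 & 0 & 0 \\ 0 & 0 & \sigma^2_3 & 0 \\ 0 & 0 & 0 & \sigma^2_4 \end{pmatrix}$

Type: Compound Symmetry    Option: CS
ijth element: $\sigma^2 + \sigma_1$ for i=j, $\sigma_1$ otherwise
Structure: $\begin{pmatrix} \sigma^2+\sigma_1 & \sigma_1 & \sigma_1 & \sigma_1 \\ \sigma_1 & \sigma^2+\sigma_1 & \sigma_1 & \sigma_1 \\ \sigma_1 & \sigma_1 & \sigma^2+\sigma_1 & \sigma_1 \\ \sigma_1 & \sigma_1 & \sigma_1 & \sigma^2+\sigma_1 \end{pmatrix}$

Type: Unstructured    Option: UN
ijth element: $\sigma_{ij}$, symmetric ($\sigma_{ij} = \sigma_{ji}$)
Structure: $\begin{pmatrix} \sigma_{11} & \sigma_{21} & \sigma_{31} & \sigma_{41} \\ \sigma_{21} & \sigma_{22} & \sigma_{32} & \sigma_{42} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} & \sigma_{43} \\ \sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma_{44} \end{pmatrix}$

Type: First-Order Autoregressive    Option: AR(1)
ijth element: $\sigma^2$ for i=j, $\sigma^2\rho^{|i-j|}$ otherwise
Structure: $\sigma^2 \begin{pmatrix} 1 & \rho & \rho^2 & \rho^3 \\ \rho & 1 & \rho & \rho^2 \\ \rho^2 & \rho & 1 & \rho \\ \rho^3 & \rho^2 & \rho & 1 \end{pmatrix}$

Type: Toeplitz    Option: TOEP
ijth element: $\sigma^2$ for i=j, $\sigma_{|i-j|}$ otherwise
Structure: $\begin{pmatrix} \sigma^2 & \sigma_1 & \sigma_2 & \sigma_3 \\ \sigma_1 & \sigma^2 & \sigma_1 & \sigma_2 \\ \sigma_2 & \sigma_1 & \sigma^2 & \sigma_1 \\ \sigma_3 & \sigma_2 & \sigma_1 & \sigma^2 \end{pmatrix}$

Type: Toeplitz with two bands (or a given number of bands)    Option: TOEP(2)
ijth element: $\sigma^2$ for i=j, $\sigma_{|i-j|}$ otherwise for a given number of bands, 0 elsewhere
Structure: $\begin{pmatrix} \sigma^2 & \sigma_1 & 0 & 0 \\ \sigma_1 & \sigma^2 & \sigma_1 & 0 \\ 0 & \sigma_1 & \sigma^2 & \sigma_1 \\ 0 & 0 & \sigma_1 & \sigma^2 \end{pmatrix}$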
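To make two of these structures concrete, the sketch below builds 4 by 4 compound symmetry and AR(1) matrices from their ijth-element definitions. The parameter values are hypothetical and the function names are illustrative helpers, not SAS syntax.

```python
def compound_symmetry(size, sigma2, sigma1):
    """CS: sigma2 + sigma1 on the diagonal, sigma1 everywhere else."""
    return [[sigma2 + sigma1 if i == j else sigma1 for j in range(size)]
            for i in range(size)]

def ar1(size, sigma2, rho):
    """AR(1): sigma2 * rho**|i-j|, so correlation decays with separation."""
    return [[sigma2 * rho ** abs(i - j) for j in range(size)]
            for i in range(size)]

# Hypothetical parameter values, chosen only to give round numbers.
cs = compound_symmetry(4, sigma2=4.0, sigma1=1.0)
ar = ar1(4, sigma2=4.0, rho=0.5)

for row in cs:
    print(row)
for row in ar:
    print(row)
```

Compound symmetry gives every pair of measurements the same covariance (the split-plot assumption), while AR(1) lets the covariance shrink as measurements get farther apart in time.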
This note was uploaded on 12/29/2011 for the course EXST 7015 taught by Professor Wang,j during the Fall '08 term at LSU.