Statistical Techniques II Page 111
tech(laboratory), type(laboratory, tech) and replicate(laboratory, tech, type)
TEST H=LAB E=TECH(LAB) / HTYPE=1 ETYPE=1;
TEST H=TECH E=TYPE(TECH*LAB) / HTYPE=1 ETYPE=1;
Note that SAS GLM will provide you with the EMS with numeric coefficients when you use the
RANDOM statement (with or without the TEST option).
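To see what a statement such as TEST H=LAB E=TECH(LAB) asks for, it can help to compute the F ratios by hand. The following is a minimal Python sketch (not SAS), assuming a balanced design with only two levels of nesting (technicians within labs, with replicates as the residual), which is one level fewer than the design in these notes. The data values and the names `data` and `nested_anova` are hypothetical.

```python
from statistics import mean

# Hypothetical balanced data: data[lab][tech] is a list of replicate readings.
data = {
    "lab1": {"t1": [10.0, 12.0], "t2": [14.0, 16.0]},
    "lab2": {"t1": [20.0, 22.0], "t2": [18.0, 20.0]},
    "lab3": {"t1": [13.0, 15.0], "t2": [11.0, 13.0]},
}

def nested_anova(data):
    """Return (MS_lab, MS_tech_within_lab, MS_error) for a balanced nesting."""
    labs = list(data.values())
    a = len(labs)                          # number of labs
    b = len(labs[0])                       # technicians per lab
    n = len(next(iter(labs[0].values())))  # replicates per technician
    grand = mean(y for techs in labs for ys in techs.values() for y in ys)
    ss_lab = ss_tech = ss_err = 0.0
    for techs in labs:
        lab_mean = mean(y for ys in techs.values() for y in ys)
        ss_lab += b * n * (lab_mean - grand) ** 2
        for ys in techs.values():
            tech_mean = mean(ys)
            ss_tech += n * (tech_mean - lab_mean) ** 2
            ss_err += sum((y - tech_mean) ** 2 for y in ys)
    return ss_lab / (a - 1), ss_tech / (a * (b - 1)), ss_err / (a * b * (n - 1))

ms_lab, ms_tech, ms_err = nested_anova(data)
f_lab = ms_lab / ms_tech    # labs tested against technicians within labs
f_tech = ms_tech / ms_err   # technicians within labs tested against the residual
```

The point of the sketch is only that each level is tested against the mean square of the level nested directly beneath it, not against the residual.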
PROC MIXED is a newer alternative for solving this type of problem. This procedure works
differently from the usual least squares procedures.
It is not a least squares solution; it is an iterative (maximum likelihood) solution.
It estimates the random variance components (σ²) instead of the EMS.
Then it tests any fixed effect components of the model.
PROC MIXED works a little differently from PROC GLM. In PROC MIXED the fixed
components ONLY go in the model and the random effects go in the RANDOM statement.
There is not a “test” option in PROC MIXED.
Are labs fixed or random? They could be either, depending on how or why they were chosen.
Confidence limits for the random variance components were requested.
Hierarchical design example (Appendix 17b): Snedecor & Cochran, 1980 (pg 293)
This is an example of a CRD with different numbers of observations at different levels. Wheat
yields were available for 6 districts in the midwest. There were UP TO 10 farms per district and
UP TO 3 fields per farm.
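The model for this design is $Y_{ijk} = \mu + \tau_i + \delta_{ij} + \varepsilon_{ijk}$, where $\tau_i$ is the district effect, $\delta_{ij}$ is the effect of farm within district and $\varepsilon_{ijk}$ is the variation among fields within farms.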
In this example the TEST statement was requested for GLM, but it actually gives the wrong
answers because the unbalanced design has no clear error term estimated (see EMS coefficients).
The TEST option on the RANDOM statement will make a simple algebraic adjustment for
unbalanced designs. Occasionally negative F values can result.
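For reference, the expected mean squares for the balanced version of this hierarchical design can be written out as below. This is a sketch under assumptions the wheat data do not meet: it supposes exactly $b$ farms in every district and exactly $n$ fields on every farm, and the symbols $\sigma_\tau^2$ (districts), $\sigma_\delta^2$ (farms within districts) and $\sigma^2$ (fields within farms) are notation chosen here for illustration.

```latex
\begin{aligned}
E(MS_{\text{district}})       &= \sigma^2 + n\,\sigma_\delta^2 + bn\,\sigma_\tau^2 \\
E(MS_{\text{farm(district)}}) &= \sigma^2 + n\,\sigma_\delta^2 \\
E(MS_{\text{error}})          &= \sigma^2
\end{aligned}
```

In the balanced case the first two lines differ only by the district term, so MS(district) / MS(farm(district)) is an exact F test. With unequal numbers the coefficient on $\sigma_\delta^2$ is generally not the same in the two lines, which is why no single mean square serves as a clean error term.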
The TEST statement produced a test of DISTRICTS with a P-value of 0.3056.
The TEST option on the RANDOM statement causes the tests to be calculated (with the
appropriate error terms).
This option also adjusts the tests to account for the unequal coefficients on the EMS. The P-value
was 0.4601.
The MIXED model again estimates the variance components with confidence intervals. Note that
the confidence intervals do not include zero (a negative value should not be possible with variance
components), but they are very wide and overlap with the other variance components.
Note that algebraic calculations of the variance components from GLM give different results.
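The "algebraic" calculation equates each mean square to its expected value and solves for the variance components. Here is a minimal Python sketch for the balanced two-stage case only, assuming `b` mid-level units per top-level unit and `n` observations per mid-level unit. The function name and the mean squares passed to it are hypothetical, and the unbalanced wheat data would need the actual EMS coefficients printed by GLM instead of `n` and `b * n`.

```python
def variance_components(ms_top, ms_mid, ms_err, b, n):
    """Method-of-moments estimates for a balanced two-stage nested design.

    ms_top: mean square for the top level (e.g. districts)
    ms_mid: mean square for the nested level (e.g. farms within districts)
    ms_err: residual mean square (e.g. fields within farms)
    """
    var_err = ms_err
    var_mid = (ms_mid - ms_err) / n        # from E(MS_mid) = var_err + n*var_mid
    var_top = (ms_top - ms_mid) / (b * n)  # from E(MS_top) = E(MS_mid) + b*n*var_top
    return var_top, var_mid, var_err

# Hypothetical mean squares; nothing stops an estimate from going negative.
positive_case = variance_components(65.0, 8.0, 2.0, b=2, n=2)
negative_case = variance_components(5.0, 8.0, 2.0, b=2, n=2)
```

Nothing in the algebra keeps an estimate from being negative, as the second call shows. PROC MIXED by default constrains variance components to be non-negative and estimates them by likelihood, which is one reason the two sets of estimates can disagree.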
RBD without “replicates” example (Appendix 17c): Snedecor & Cochran, 1980 (pg 256)
The experiment tests the failure of soybean seeds to germinate after treatment with one of 4
fungicides and a control.
This is a randomized block design without replication within blocks. There are five blocks and five
treatment levels (the 4 fungicides plus the control) in each block.
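James P. Geaghan  Copyright 2011 Statistical Techniques II Page 112
The model for this design is $Y_{ij} = \mu + \beta_i + \tau_j + \varepsilon_{ij}$, where $\beta_i$ is the block effect, $\tau_j$ is the treatment effect and $\varepsilon_{ij}$ is the error.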
The error term is the cell to cell variation estimated by the “interaction” term. Since there is no
error for testing the interaction we ASSUME that this variation represents only random variation
and that there is no real interaction.
The GLM tests for this model are correct because the lone error term is used for both of the
sources in the model.
Note that the output from the test option on the RANDOM statement provides the same results.
The coefficients on the BLOCK and TREATMENT EMS are both 5. The 5 blocks are the
treatment reps and the 5 treatments are the block reps.
The PROC MIXED was fitted with the treatments as random effects. The output gives the same
test as the GLM. This is often true for simple designs.
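The arithmetic behind an RBD without replication is short enough to sketch directly. The Python example below assumes a small, made-up 3 by 3 table (rows are blocks, columns are treatments); the values and the names `table` and `rbd_anova` are hypothetical and are not the soybean data. It shows the block-by-treatment "interaction" being used as the only error term for both tests.

```python
from statistics import mean

# Hypothetical table: each row is a block, each column is a treatment.
table = [
    [8.0, 10.0, 15.0],
    [6.0, 9.0, 12.0],
    [7.0, 11.0, 12.0],
]

def rbd_anova(table):
    """F ratios for blocks and treatments, both tested against the residual."""
    r, t = len(table), len(table[0])
    grand = mean(y for row in table for y in row)
    block_means = [mean(row) for row in table]
    trt_means = [mean(row[j] for row in table) for j in range(t)]
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_trt = r * sum((m - grand) ** 2 for m in trt_means)
    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    # With one observation per cell, whatever is left over is the
    # block x treatment "interaction", which serves as the error term.
    ss_resid = ss_total - ss_block - ss_trt
    df_resid = (r - 1) * (t - 1)
    ms_resid = ss_resid / df_resid
    return {
        "F_block": (ss_block / (r - 1)) / ms_resid,
        "F_trt": (ss_trt / (t - 1)) / ms_resid,
        "df_resid": df_resid,
    }

result = rbd_anova(table)
```

Because both effects share the same denominator, no separate TEST statement is needed for a design like this.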
A histogram is the only graphic output needed to express the difference between the main effects.
RBD with “replicates” example (Appendix 17d): Snedecor & Cochran, 1980 (pg 267)
The experiment tests efficacy of fumigants on wire worms (nematodes). There are two fumigants
(C and S) and a control (0).
There are 5 blocks.
This is a randomized block design with replication within blocks. There are five blocks with three
treatment levels in each block and 4 replicates for each treatment in each block.
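The model for this design is $Y_{ijk} = \mu + \beta_i + \tau_j + (\beta\tau)_{ij} + \varepsilon_{ijk}$, where $\beta_i$ is the block effect, $\tau_j$ is the treatment effect, $(\beta\tau)_{ij}$ is the block by treatment interaction and $\varepsilon_{ijk}$ is the variation among replicates.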
If the replicates are true experimental replicates (an estimate of experimental error) we can use
them to test the interaction. If they are sampling units within plots the test makes less sense.
In this case the test was done with the TEST statement instead of the RANDOM statement test
option. The TEST statement should be adequate since the appropriate error term can be
specified (see EMS) and the design is balanced.
The MIXED model analysis of this simple problem will also give the same result.
A histogram shows the differences between the main effects.
The test of the Fumigant*Block interaction is done with the residual error. This test is provided
by default in SAS, and is correct (see EMS).
This test indicates that there is a significant difference. If the replicates are samples within
plots this is expected, because sampling error is expected to be smaller than experimental error.
It was noted that there appeared to be a pattern of the means and variances, such that the variance
increased as the mean increased. This suggests nonhomogeneous variance, and was addressed by
a log transformation.
NOTE: In PROC GLM the HOV tests are only available for CRD (Oneway ANOVA).
The model run on logarithms, log(Yij+1), was the same. Note that the variance no longer appears
nonhomogeneous. The results changed little. The P-value for the arithmetic results was 0.0258
and for the logarithms was 0.0112.
The PROC MIXED gave the same result.
Although the results did not change much, the second application better meets the assumptions.
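The effect of a log(Y+1) transformation on a mean-variance relationship can be illustrated with a few lines of Python. This sketch uses made-up counts in which each group is ten times the previous one, so the raw variances differ by a factor of 100 from one group to the next. The values and the names `groups` and `variance_ratio` are hypothetical and are not the wireworm data.

```python
from math import log
from statistics import variance

# Hypothetical counts for three groups whose spread grows with the mean.
groups = {
    "low": [1, 3, 2, 4],
    "medium": [10, 30, 20, 40],
    "high": [100, 300, 200, 400],
}

def variance_ratio(groups, transform=lambda y: y):
    """Largest group variance divided by the smallest, after transforming."""
    variances = [variance([transform(y) for y in ys]) for ys in groups.values()]
    return max(variances) / min(variances)

raw_ratio = variance_ratio(groups)                                  # about 10000
log_ratio = variance_ratio(groups, transform=lambda y: log(y + 1))  # far smaller
```

The log-scale variances are not identical, but they are of the same order of magnitude, which is the sense in which the transformed analysis "better meets the assumptions".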
Latin Square Example (Appendix 17e): Snedecor & Cochran, 1980 (pg 271)
The experiment tests for differences in yield of millet for different row spacings. There are 5
spacings at 2, 4, 6, 8 and 10 inches.
This is a Latin Square Design because the investigators decided that the rows and columns were
sufficiently heterogeneous to justify blocking.
Since there are 5 treatments we need 5 rows and 5 columns for a Latin Square Design.
Since there are no replicated samples within the 25 cells we have only the “interaction” as an
error term.
The 12 d.f. for the remaining interaction could be obtained by interacting any two components
(rows, columns or treatments) if we were to put them in the model.
However, since it is the only error term we leave it off the model and let SAS pick it up as the
residual error term.
This error is the appropriate error for all three main effect components of the model (see EMS).
No additional test statements are needed.
The PROC MIXED gives exactly the same results for this simple Latin Square Design.
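The degrees of freedom accounting for a single Latin Square can be checked with a short Python sketch. The cyclic layout built below is only an illustration of a valid square for the five spacings; it is an assumption and not the randomization used in the Snedecor & Cochran experiment. The function names are hypothetical.

```python
def latin_square_df(t):
    """Degrees of freedom for a t x t Latin Square with one value per cell."""
    total = t * t - 1
    main = t - 1
    return {
        "rows": main,
        "columns": main,
        "treatments": main,
        "error": total - 3 * main,  # the leftover "interaction"
        "total": total,
    }

def is_latin_square(square):
    """True if every treatment appears exactly once in each row and column."""
    t = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == t and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(t)} == symbols for j in range(t))
    return len(symbols) == t and rows_ok and cols_ok

spacings = [2, 4, 6, 8, 10]
# Hypothetical cyclic arrangement: row i is the spacing list shifted by i.
square = [[spacings[(i + j) % 5] for j in range(5)] for i in range(5)]
df = latin_square_df(5)
```

For t = 5 the error line comes out to 12 d.f., matching the residual that SAS picks up when only rows, columns and treatments are in the model.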
A series of Latin Squares is a number of separate Latin Squares with the same treatments in each
square. The separate squares are usually blocks, but they could be another treatment. We will discuss
experiments with several treatments soon.
Since we do not expect ROWS and COLUMNS to necessarily be the same in each square, we nest
these effects. We do expect the treatments to be the same in each square, so the treatments and the
squares can be factored out as main effects.
We can also get an interaction between the squares and the treatments.
The remnants of the remaining interaction are pooled into a single error term.
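The partition described above can be tallied in a few lines of Python. This is a sketch that assumes `s` squares, each a full t x t square containing the same `t` treatments, with one observation per cell; the function name and the labels are hypothetical and are not SAS output.

```python
def latin_squares_series_df(s, t):
    """Degrees of freedom for s separate t x t Latin Squares, same treatments."""
    df = {
        "squares": s - 1,
        "treatments": t - 1,
        "squares*treatments": (s - 1) * (t - 1),
        "rows(squares)": s * (t - 1),     # rows nested within squares
        "columns(squares)": s * (t - 1),  # columns nested within squares
    }
    total = s * t * t - 1
    df["error"] = total - sum(df.values())  # pooled remnants of the interaction
    df["total"] = total
    return df

example = latin_squares_series_df(2, 5)
```

Under these assumptions the pooled error always works out to s(t-1)(t-2) d.f., which for two 5 x 5 squares is 24.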
One last unfortunate fact is apparent from the EMS. For higher order designs with more terms we
often see that there is no one correct error term. Note that there is no proper error term for the
squares. This is not very important if these are actually blocks.
In GLM the TEST option on the RANDOM statement will calculate an error term for this test.
PROC MIXED has no difficulty with this problem. If SQUARES were fixed, it would get a proper
error term.
Another type of problem is apparent from the PROC MIXED estimates of the variance
components. Some variance components cannot be estimated and are set to zero. Examine the
comparable components in the PROC GLM. Could you estimate these algebraically?
End Design discussion!
Statistics quote: Fett's Law: Never replicate a successful experiment. (Prasad, c1prasad@watson.ibm.com)
Overview of ANOVA
Recall that we are testing for differences among indicator variables.
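The treatments may be fixed or random. $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \dots = \mu_t$ for fixed effects.
$H_0: \sigma_\tau^2 = 0$ for random effects. Assume $e_i \sim \mathrm{NIDrv}(0, \sigma^2)$. Remember that this covers 3 separate assumptions (normality, independence and homogeneous variance).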
Also, assume no block “interactions” for the RBD.
Every analysis can be expressed as a model with appropriate notation and subscripting.
CRD: $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$
For the moment we will be concerned only with examining for differences among the treatment
levels. We will assume that we have already detected a significant difference among treatment
levels with ANOVA.
Treatment levels may be fixed or random. Determining appropriate tests depends on correctly
recognizing which they are. With random effects we are probably not interested in individual treatment levels. We
are likely to be interested in the variability among the treatment levels and the distribution of the
levels. With fixed effects we will probably want to compare individual levels.