EXST7015 Fall2011 Lect23

Statistical Techniques II Page 111

tech(laboratory), type(laboratory, tech) and replicate(laboratory, tech, type)

    TEST H=LAB E=TECH(LAB) / HTYPE=1 ETYPE=1;
    TEST H=TECH E=TYPE(TECH*LAB) / HTYPE=1 ETYPE=1;

Note that SAS GLM will provide the EMS with numeric coefficients whenever the RANDOM statement is used (with or without the TEST option).

PROC MIXED is a newer alternative for solving this type of problem. This procedure works differently from the usual least squares procedures. It is not a least squares solution; it is an iterative solution (maximum likelihood). It estimates the random variance components ($\sigma^2$) instead of the EMS, and then tests any fixed effect components of the model.

PROC MIXED is also specified a little differently from PROC GLM. In PROC MIXED ONLY the fixed components go in the MODEL statement, and the random effects go in the RANDOM statement. There is no “test” option in PROC MIXED.

Are labs fixed or random? They could be either, depending on how or why they were chosen. Confidence limits for the random variance components were requested.

Hierarchical design example (Appendix 17b): Snedecor & Cochran, 1980 (pg 293)

This is an example of a CRD with different numbers of observations at different levels. Wheat yields were available for 6 districts in the midwest. There were UP TO 10 farms per district and UP TO 3 fields per farm.

The model for this design is $Y_{ijk} = \mu + \tau_i + \delta_{ij} + \varepsilon_{ijk}$, where $\tau_i$ is the district effect, $\delta_{ij}$ is the effect of farm within district, and $\varepsilon_{ijk}$ is the variation among fields within farms.

In this example the TEST statement was requested for GLM, but it actually gives the wrong answer because the unbalanced design has no single, clearly estimated error term (see the EMS coefficients). The TEST option on the RANDOM statement makes a simple algebraic adjustment for unbalanced designs. Occasionally negative F values can result.

The TEST statement produced a test of DISTRICTS with a P-value of 0.3056. The TEST option on the RANDOM statement causes the tests to be calculated with the appropriate error terms.
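The GLM and MIXED approaches described for this kind of nested, unbalanced design might be set up roughly as follows. This is only a sketch under stated assumptions: the data set name WHEAT, the variable names (DISTRICT, FARM, FIELD, YIELD) and the yields themselves are made up for illustration and are not the Snedecor & Cochran data or the Appendix 17b program, and both districts and farms are treated as random.

```sas
/* HYPOTHETICAL data: fields within farms within districts.          */
/* The values are invented; the design is deliberately unbalanced.   */
data wheat;
   input district farm field yield;
   datalines;
1 1 1 23
1 1 2 19
1 2 1 31
1 2 2 37
1 2 3 33
2 1 1 33
2 1 2 29
2 2 1 29
3 1 1 36
3 1 2 29
3 2 1 11
3 2 2 21
3 3 1 23
;
run;

/* Least squares: every term goes in MODEL; RANDOM prints the EMS.   */
proc glm data=wheat;
   class district farm;
   model yield = district farm(district);
   /* TEST option: tests built from the EMS coefficients             */
   random district farm(district) / test;
   /* Explicit TEST statement: error term chosen by the analyst      */
   test h=district e=farm(district) / htype=1 etype=1;
run;
quit;

/* Likelihood-based: no fixed effects are assumed here, so MODEL has */
/* only the intercept; CL requests confidence limits for the         */
/* variance components.                                              */
proc mixed data=wheat cl covtest;
   class district farm;
   model yield = ;
   random district farm(district);
run;
```

Comparing the EMS table printed by the RANDOM statement with the error term named in the TEST statement is one way to see why the two GLM tests can disagree when the design is unbalanced.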
The TEST option on the RANDOM statement also adjusts the tests to account for the unequal coefficients on the EMS. With this adjustment the P-value was 0.4601.

PROC MIXED again estimates the variance components with confidence intervals. Note that the confidence intervals do not include zero (a negative value should not be possible for a variance component), but they are very wide and overlap with the other variance components. Note also that algebraic calculation of the variance components from GLM gives different results.

RBD without “replicates” example (Appendix 17c): Snedecor & Cochran, 1980 (pg 256)

The experiment tests the failure of soybean seeds to germinate after treatment with one of 4 fungicides or a control. It is a randomized block design without replication within blocks: five blocks, with five treatment levels (the 4 fungicides plus the control) in each block.

James P. Geaghan - Copyright 2011

The model for this design is $Y_{ij} = \mu + \beta_i + \tau_j + \varepsilon_{ij}$, where $\beta_i$ is the block effect and $\tau_j$ is the treatment effect.

The error term is the cell to cell variation estimated by the “interaction” term. Since there is no separate error for testing the interaction, we ASSUME that this variation represents only random variation and that there is no real interaction.

The GLM tests for this model are correct because the lone error term is used for both of the sources in the model. Note that the output from the TEST option on the RANDOM statement provides the same results. The coefficients on the BLOCK and TREATMENT EMS are both 5: the 5 blocks are the treatment reps and the 5 treatments are the block reps.

PROC MIXED was fitted with the treatments as random effects. The output gives the same test as GLM. This is often true for simple designs.

A histogram is the only graphic output needed to express the difference between the main effects.

RBD with “replicates” example (Appendix 17d): Snedecor & Cochran, 1980 (pg 267)

The experiment tests the efficacy of fumigants on wire worms (nematodes). There are two fumigants (C and S) and a control (0). There are 5 blocks.
This is a randomized block design with replication within blocks: five blocks, with three treatment levels in each block and 4 replicates for each treatment in each block.

The model for this design is $Y_{ijk} = \mu + \beta_i + \tau_j + (\beta\tau)_{ij} + \varepsilon_{ijk}$, where $\beta_i$ is the block effect, $\tau_j$ is the treatment effect and $(\beta\tau)_{ij}$ is their interaction.

If the replicates represent replicated experimental error, we can use them to test the interaction. If they are sampling units within plots, the test makes less sense.

In this case the test was done with the TEST statement instead of the TEST option on the RANDOM statement. The TEST statement should be adequate since the appropriate error term can be specified (see EMS) and the design is balanced. The MIXED model analysis of this simple problem will also give the same result. A histogram shows the differences between the main effects.

The test of the Fumigant*Block interaction is done with the residual error. This test is provided by default in SAS, and is correct (see EMS). It indicates that there is a significant difference. If the replicates are samples within plots this is expected, since sampling error is expected to be smaller than experimental error.

It was noted that there appeared to be a pattern in the means and variances, such that the variance increased as the mean increased. This suggests nonhomogeneous variance, and was addressed by a log transformation. NOTE: In PROC GLM the HOV tests are only available for a CRD (one-way ANOVA).

The same model was run on the logarithms of (Yij+1). Note that the variance no longer appears nonhomogeneous. The results changed little: the P-value for the arithmetic results was 0.0258 and for the logarithms was 0.0112. PROC MIXED gave the same result. Although the results did not change much, the second analysis better meets the assumptions.

Latin Square Example (Appendix 17e): Snedecor & Cochran, 1980 (pg 271)

The experiment tests for differences in yield of millet for different row spacings. There are 5 spacings at 2, 4, 6, 8 and 10 inches.
This is a Latin Square Design because the investigators decided that the rows and columns were sufficiently heterogeneous to justify blocking. Since there are 5 treatments we need 5 rows and 5 columns for a Latin Square Design.

Since there are no replicated samples within the 25 cells, we have only the “interaction” as an error term. The 12 d.f. for the remaining interaction could be obtained by interacting any two components (rows, columns or treatments) if we were to put them in the model. However, since it is the only error term, we leave it out of the model and let SAS pick it up as the residual error term. This error is the appropriate error for all three main effect components of the model (see EMS). No additional TEST statements are needed. PROC MIXED gives exactly the same results for this simple Latin Square Design.

A series of Latin Squares is a number of separate Latin Squares with the same treatments in each square. The separate squares are usually blocks, but they could be another treatment. We will discuss experiments with several treatments soon.

Since we do not expect ROWS and COLUMNS to necessarily be the same in each square, we nest these effects within squares. We do expect the treatments to be the same in each square, so the treatments and the squares can be factored out as main effects. We can also get an interaction between the squares and the treatments. The remnants of the remaining interaction are pooled into a single error term.

One last unfortunate fact is apparent from the EMS. For higher order designs with more terms we often see that there is no one correct error term. Note that there is no proper error term for the squares. This is not very important if the squares are actually blocks. In GLM the TEST option on the RANDOM statement will calculate an error term for this test. PROC MIXED has no difficulty with this problem. If SQUARES were fixed, it would get a proper error term.
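The series of Latin Squares described above might be coded roughly as follows. This is a sketch only, not the Appendix 17e program: the data set LSERIES, the variable names and the yields are hypothetical, two 3x3 squares are used to keep the example short, squares are assumed to be random blocks, and treatments are assumed to be fixed.

```sas
/* HYPOTHETICAL data: two 3x3 Latin Squares with the same treatments */
/* (A, B, C) in each square. Yields are invented for illustration.   */
data lseries;
   input square row col trt $ yield @@;
   datalines;
1 1 1 A 21  1 1 2 B 25  1 1 3 C 30
1 2 1 B 24  1 2 2 C 29  1 2 3 A 20
1 3 1 C 31  1 3 2 A 22  1 3 3 B 26
2 1 1 A 18  2 1 2 C 27  2 1 3 B 23
2 2 1 C 28  2 2 2 B 22  2 2 3 A 19
2 3 1 B 21  2 3 2 A 17  2 3 3 C 26
;
run;

/* GLM: rows and columns are nested within squares; the leftover     */
/* interaction is left out of MODEL and becomes the residual.        */
/* For a single square the model would reduce to: yield = row col trt */
proc glm data=lseries;
   class square row col trt;
   model yield = square row(square) col(square) trt square*trt;
   /* TEST option builds an error term where no single MS is proper  */
   random square row(square) col(square) square*trt / test;
run;
quit;

/* MIXED: only the fixed effect (treatment) goes in MODEL.           */
proc mixed data=lseries cl;
   class square row col trt;
   model yield = trt;
   random square row(square) col(square) square*trt;
run;
```

With a data set this small, PROC MIXED may well report some variance component estimates of zero, so the output is best read for its structure rather than its numbers.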
Another type of problem is apparent from the PROC MIXED estimates of the variance components. Some variance components cannot be estimated and are set to zero. Examine the comparable components in the PROC GLM output. Could you estimate these algebraically?

End Design discussion!

Statistics quote: Fett's Law: Never replicate a successful experiment. (Prasad, c1prasad@watson.ibm.com)

Overview of ANOVA

Recall that we are testing for differences among the levels of indicator variables. The treatments may be fixed or random.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \dots = \mu_t$ for fixed effects.
$H_0: \sigma_\tau^2 = 0$ for random effects.

Assume $\varepsilon_{ij} \sim \mathrm{NIDrv}(0, \sigma^2)$. Remember that this covers 3 separate assumptions (normality, independence and homogeneous variance). Also, assume no block “interactions” for the RBD.

Every analysis can be expressed as a model with appropriate notation and subscripting.

CRD: $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$

For the moment we will be concerned only with examining differences among the treatment levels. We will assume that we have already detected a significant difference among treatment levels with ANOVA.

Treatment levels may be fixed or random, and determining the appropriate tests depends on recognizing which they are. With random effects we are probably not interested in individual treatment levels. We are likely to be interested in the variability among the treatment levels and the distribution of the levels. With fixed effects we will probably want to compare individual levels.
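The fixed versus random distinction for the CRD could be sketched in SAS as follows. Everything here is an assumption made for illustration: the data set CRD, the variables TRT and Y, and the values are hypothetical, and Tukey's procedure is simply one possible way to compare individual levels, not necessarily the method these notes go on to use.

```sas
/* HYPOTHETICAL CRD data: 3 treatment levels, 4 observations each.   */
data crd;
   input trt $ y @@;
   datalines;
A 12 A 15 A 11 A 14
B 18 B 17 B 21 B 19
C 13 C 16 C 12 C 15
;
run;

/* Fixed treatment levels: test equality of the means, then compare  */
/* individual levels (Tukey chosen here only as an example).         */
proc glm data=crd;
   class trt;
   model y = trt;
   means trt / tukey;
run;
quit;

/* Random treatment levels: interest shifts to the variance among    */
/* levels, so the treatment goes in RANDOM and MODEL is empty.       */
proc mixed data=crd cl covtest;
   class trt;
   model y = ;
   random trt;
run;
```

The same data are analyzed both ways only to show the difference in specification; in practice the choice follows from how the treatment levels were selected.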