EXST 7015 Statistical Techniques II - Fall 2011, Lecture 18
Analysis of Variance and Experimental Design

The simplest model for Analysis of Variance (ANOVA) is the CRD, the Completely Randomized Design. This model is also called "one-way" Analysis of Variance. Unlike regression, which fits slopes for regression lines and calculates a measure of random variation about those lines, ANOVA fits means and variation about those means. The hypotheses tested are hypotheses about the equality of means:

$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_t$, where the $\mu_i$ represent the means of the levels of some categorical variable and $t$ is the number of levels of that variable.
$H_1$: some $\mu_i$ is different.

We will generically refer to the categorical variable as the "treatment", even though it may not actually be an effect manipulated by the experimenter. The number of treatments will be designated t. The number of observations within treatments will be designated n for a balanced design (the same number of observations in each treatment), or ni for an unbalanced design (for i = 1 to t).

The assumptions for basic ANOVA are very similar to those of regression: the residuals (deviations of observations within groups) should be normally distributed, the treatments should be independently sampled, and the variance of each treatment should be the same (homogeneous variance).

ANOVA review

I am borrowing some material from my EXST7005 notes on the t-test and ANOVA; see those notes for a more complete review of the introduction to Analysis of Variance (ANOVA). Start with the logic behind ANOVA. Prior to R. A. Fisher's development of ANOVA, investigators were likely to use a series of t-tests to test among t treatment levels. What is wrong with that? Recall the Bonferroni adjustment: each time we do a test we increase the chance of error. To test among 3 treatments we need to do 3 pairwise tests; among 4 treatments, 6 tests; among 5 treatments, 10 tests; in general, t(t-1)/2 tests.
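The pairwise-test count above is just the binomial coefficient "t choose 2". A quick Python sketch (the example values follow the text; the familywise error figure assumes independent tests, which is a simplification):

```python
from math import comb

def pairwise_tests(t):
    """Number of pairwise two-sample t-tests needed among t treatment levels."""
    return comb(t, 2)  # t choose 2 = t*(t-1)/2

for t in (3, 4, 5):
    print(t, "treatments ->", pairwise_tests(t), "pairwise tests")

# With 10 tests each at alpha = 0.05, the chance of at least one Type I
# error (if all nulls are true and the tests were independent) is
# 1 - 0.95**10, roughly 0.40; this inflation is the problem ANOVA avoids.
print(1 - 0.95 ** 10)
```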
What is needed is ONE test for a difference among all treatment levels, with one overall value of $\alpha$ specified by the investigator (usually 0.05). Fisher's solution was simple, but elegant. Suppose we have a treatment with 5 categories or levels. We can calculate a mean and variance for each treatment level. In order to get one really good estimate of variance we can pool the individual variances of the 5 categories (assuming homogeneity of variance). This pooled variance can be calculated as a weighted mean of the variances, weighted by their degrees of freedom.

James P. Geaghan - Copyright 2011

[Figure: observations for groups A through E plotted about their group means]

Since $S_1^2 = SS_1/(n_1-1)$, we have $(n_1-1)S_1^2 = SS_1$, so the weighted mean is simply the sum of the SS divided by the sum of the d.f.:

$$S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2 + (n_3-1)S_3^2 + (n_4-1)S_4^2 + (n_5-1)S_5^2}{(n_1-1)+(n_2-1)+(n_3-1)+(n_4-1)+(n_5-1)}$$

$$S_p^2 = \frac{SS_1 + SS_2 + SS_3 + SS_4 + SS_5}{(n_1-1)+(n_2-1)+(n_3-1)+(n_4-1)+(n_5-1)}$$

So we have one very good estimate of the random variation, or sampling error, $S_p^2$. Then what?

Now consider the treatment means. Why don't they all fall on the overall mean? Actually, under the null hypothesis, they should, except for some random variation. So if we estimate that random variation among the means, it should be equal to the same error we already estimated within groups. Recall that the variance of means is estimated as $S^2/n$, the variance of the sample divided by the sample size; the standard error is the square root of this. If we actually use the treatment means to estimate a variance, we are estimating the variance of means, $S_{\bar{Y}}^2 = S^2/n$. If we multiply this by n it should equal $S^2$, which we estimated with $S_p^2$, the pooled variance estimate. So if the null hypothesis is true, the mean square of the deviations within groups should be equal to the mean square of the deviations of the means multiplied by n!

[Figure: deviations within groups and deviations between groups (of the means) for groups A through E]

Now, if the null hypothesis is not true, and some $\mu_i$ is different, then what? Then, when we calculate a mean square of deviations of the means from the overall mean, it should be larger than the previously estimated $S_p^2$.

So we have two estimates of variance, $S_p^2$ and the variance from the treatment means. If the null hypothesis is true, they should not be significantly different. If the null hypothesis is FALSE, the treatment mean square should be larger. It will therefore be a ONE-TAILED TEST!

We usually present this in an "Analysis of Variance" table.

Source       d.f.     Sum of Squares   Mean Square
Treatment    t-1      SSTreatment      MSTreatment
Error        t(n-1)   SSError          MSError
Total        tn-1     SSTotal

Degrees of freedom: There are tn observations total (sum of the ni if unbalanced). After the correction factor, there are tn-1 d.f. for the corrected total. There are t-1 degrees of freedom for the t treatment levels. Each group contributes n-1 d.f. to the pooled error term; there are t groups, so the pooled error (MSE) has t(n-1) d.f.

The SSTreatments is the sum of squared deviations of the treatment means from the overall mean. Each deviation is denoted $\tau_i$ and is called a treatment "effect":

$$SS_{Treatments} = n\sum_{i=1}^{t}(\bar{Y}_{i.}-\bar{Y}_{..})^2 = n\sum_{i=1}^{t}\hat{\tau}_i^2$$

The model for regression is $Y_i = \beta_0 + \beta_1 X_i + e_i$. The effects model for a CRD is $Y_{ij} = \mu + \tau_i + \epsilon_{ij}$, where the treatments are i = 1, 2, ..., t and the observations are j = 1, 2, ..., n (or ni for unbalanced data). An alternative expression for the CRD, called the means model, is $Y_{ij} = \mu_i + \epsilon_{ij}$.

Statistics quote: Statistics are like a bikini. What they reveal is suggestive, but what they conceal is vital. -Aaron Levenstein

The calculations. The SSTotal is exactly the same as in regression: the sum of all observations, squared first, $\sum_{i=1}^{t}\sum_{j=1}^{n} Y_{ij}^2$.
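The pooled-variance identity above (weighted mean of the group variances equals the sum of the SS divided by the sum of the d.f.) can be checked numerically. A minimal Python sketch, using made-up data for five treatment levels (the values are hypothetical, chosen only for illustration):

```python
from statistics import variance, mean

# Hypothetical data for 5 treatment levels (unequal n is allowed)
groups = [
    [4.1, 5.0, 4.6, 5.3],
    [6.2, 5.8, 6.5],
    [5.1, 4.9, 5.5, 5.0, 5.2],
    [7.0, 6.4, 6.8],
    [5.9, 6.1, 5.6, 6.3],
]

# Weighted mean of the group variances, weighted by d.f. = n_i - 1
num = sum((len(g) - 1) * variance(g) for g in groups)
den = sum(len(g) - 1 for g in groups)
s2_pooled = num / den

# Equivalent form: sum of within-group SS divided by sum of d.f.
ss = sum(sum((y - mean(g)) ** 2 for y in g) for g in groups)
s2_pooled_ss = ss / den

print(s2_pooled, s2_pooled_ss)  # the two forms agree
```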
The correction factor is exactly the same too: all observations are summed, the sum is squared and divided by the number of observations, $CF = \left(\sum_{i=1}^{t}\sum_{j=1}^{n} Y_{ij}\right)^2 / tn$.

Obs        1     2     ...   n     sum    mean
Group 1    Y11   Y12   ...   Y1n   Y1.    $\bar{Y}_{1.}$
Group 2    Y21   Y22   ...   Y2n   Y2.    $\bar{Y}_{2.}$
Group 3    Y31   Y32   ...   Y3n   Y3.    $\bar{Y}_{3.}$
Group 4    Y41   Y42   ...   Y4n   Y4.    $\bar{Y}_{4.}$

The uncorrected SSTreatments is calculated from the marginal totals: $UncorrectedSS_{Treatments} = \sum_{i=1}^{t} Y_{i.}^2 / n$.

Calculations are the same as regression for the corrected sum of squares total. The corrected SS treatments is the uncorrected treatments (calculated from the marginals) less the same correction factor used for the total. Error is usually calculated as the SSTotal minus the SSTreatments.

We use an F test to test the equality of two variance estimates. An Analysis of Variance usually proceeds with an F test of the MSTreatment against the MSError. The test has t-1 and t(n-1) degrees of freedom. This F test will be ONE-TAILED, since we expect the treatment variance to be too large if the null hypothesis is not true.

The MSError estimates a variance we designate $\sigma^2$ (or $\sigma_\epsilon^2$). If the null hypothesis is true, the MSTreatments estimates the SAME VARIANCE, $\sigma^2$. However, if the null hypothesis is false, the MSTreatments estimates that same $\sigma^2$ plus some amount due to the differences between treatments, designated $\sigma^2 + n\sigma_\tau^2$.

Since the treatment variance can be designated $\sigma^2 + n\sigma_\tau^2$, we can see that the null hypothesis can be stated either in the usual form $H_0: \mu_1 = \mu_2 = \dots = \mu_t$ (i.e., $\sum\tau_i^2 = 0$) or as $H_0: \sigma_\tau^2 = 0$. Which is best depends on the nature of the treatment. If the treatment levels are randomly chosen from a large number of possible treatment levels, then they estimate the variance of that population of treatment levels and would be RANDOM; this variance is $\sigma_\tau^2$. However, if the treatments are not chosen from a large number of treatments (if they are either all of the levels of interest or all of the levels that exist), then they are said to be FIXED. Fixed treatment levels represent a group of means that are of interest to the investigator, so $H_0: \mu_1 = \mu_2 = \dots = \mu_t$ is a better representation of the null hypothesis than $H_0: \sigma_\tau^2 = 0$.

For fixed treatments we still calculate a sum of squared treatment effects and divide by the d.f. This quantity is designated $n\sum_{i=1}^{t}\tau_i^2/(t-1)$, and the F test is the same; fixed effects simply do not represent a variance.

The two values estimated by the MSTreatment ($\sigma^2 + n\sigma_\tau^2$ for random effects, or $\sigma^2 + n\sum\tau_i^2/(t-1)$ for fixed effects) and the MSError ($\sigma^2$) are called expected mean squares (EMS). The unwieldy term $n\sum\tau_i^2/(t-1)$ is often represented as simply $Q$.

One final note on the F test. Given that MSTreatments and MSError estimate these EMS, we can rewrite the F test as

$$F = \frac{\sigma^2 + n\sigma_\tau^2}{\sigma^2}.$$

From this we can see that it must be a one-tailed test: because $n\sigma_\tau^2$ cannot be negative, the expected ratio is always at least 1. We can also see that increasing n increases power.

SAS Example (Appendix 12).

Summary

[Figure: group means and deviations for groups A through E]

Overview of ANOVA

Recall that we are testing for differences among the levels of an indicator (categorical) variable. The treatments may be fixed or random: $H_0: \mu_1 = \mu_2 = \dots = \mu_t$ for fixed effects, $H_0: \sigma_\tau^2 = 0$ for random effects. Assume $\epsilon_{ij} \sim NID(0, \sigma^2)$; remember that this covers 3 separate assumptions (normality, independence, homogeneous variance).

Statistics quote: If your result needs a statistician then you should design a better experiment. -- Baron Ernest Rutherford

Every analysis can be expressed as a "linear" model with appropriate notation and subscripting:

Regression: $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$
CRD: $Y_{ij} = \mu + \tau_i + \epsilon_{ij}$
Factorial: $Y_{ijk} = \mu + \tau_{1i} + \tau_{2j} + (\tau_1\tau_2)_{ij} + \epsilon_{ijk}$
RBD: $Y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$ or $Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$
LSD: $Y_{ijk} = \mu + \tau_i + \rho_j + \gamma_k + \epsilon_{ijk}$
Split-plot: $Y_{ijk} = \mu + \tau_{1i} + \delta_{ij} + \tau_{2k} + (\tau_1\tau_2)_{ik} + \epsilon_{ijk}$

Treatment levels may be fixed or random. Determining the correct and appropriate tests depends on recognizing each effect correctly. With random effects we are probably not interested in individual treatment levels; we are likely to be interested in the variability among the treatment levels and the distribution of the levels.
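The calculation recipe above (uncorrected SS, correction factor, corrected SS, and error by subtraction) can be sketched in Python. The data are hypothetical, balanced, with t = 3 and n = 4, invented only to illustrate the arithmetic:

```python
from statistics import mean

# Hypothetical balanced CRD data: t = 3 treatment levels, n = 4 observations each
data = {
    "A": [12.0, 11.5, 13.1, 12.4],
    "B": [14.2, 13.8, 14.9, 14.5],
    "C": [11.0, 10.6, 11.8, 11.2],
}
t, n = len(data), 4
obs = [y for g in data.values() for y in g]

cf = sum(obs) ** 2 / (t * n)                               # correction factor
ss_total = sum(y * y for y in obs) - cf                    # corrected total SS
ss_trt = sum(sum(g) ** 2 for g in data.values()) / n - cf  # corrected treatment SS (from marginal totals)
ss_err = ss_total - ss_trt                                 # error by subtraction

# Cross-check: error SS should equal the pooled within-group SS
ss_within = sum(sum((y - mean(g)) ** 2 for y in g) for g in data.values())

df_trt, df_err = t - 1, t * (n - 1)
ms_trt, ms_err = ss_trt / df_trt, ss_err / df_err
F = ms_trt / ms_err                                        # one-tailed F with (t-1, t(n-1)) d.f.
print(F)
```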
With fixed effects we will probably want to compare individual levels.

Usual Analysis of Variance procedure:
1) $H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_t$
2) $H_1$: some $\mu_i$ is different
3) a) Assume that the observations are normally distributed about each mean, or equivalently that the residuals (deviations) are normally distributed.
   b) Assume that the observations are independent.
   c) Assume that the variances are homogeneous.
4) Set the level of Type I error, usually $\alpha$ = 0.05.
5) Determine the critical value. The test in ANOVA is a one-tailed F test.
6) Obtain data and evaluate the results.
7) Draw your conclusions based on the results.

Analysis of Variance source table:

PROC glm DATA=Cuckoo;
  CLASSES HostSpecies;
  TITLE2 'Analysis of Variance with PROC GLM';
  MODEL EggLt = HostSpecies;
run;

Source            DF    Sum of Squares   Mean Square   F Value   Pr > F
Model              5        42.9396508     8.5879302     10.39   <0.0001
Error            114        94.2483492     0.8267399
Corrected Total  119       137.188

A more modern version of analysis of variance (mixed model analysis) will be discussed below; it has many options not available in the old least squares approach above. Unfortunately, many researchers trained in the last century are still unenlightened and use this version of ANOVA.

This analysis indicates that the size of cuckoo eggs laid in at least one host species' nest is different from the size of those laid in the other hosts' nests. What did we assume?

Descriptions of post-hoc tests

Post-hoc or Post-ANOVA tests! Once you have found out that some treatment(s) are "different", how do you determine which one(s) are different? For the moment we will be concerned only with examining for differences among the treatment levels. We will assume that we have already detected a significant difference among treatment levels with ANOVA. So, having rejected the null hypothesis, we wish to determine how the treatment levels interrelate. This is the "post-ANOVA" part of the analysis. These tests fall into two general categories.
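The F value in the source table can be reproduced from the printed sums of squares and degrees of freedom, a useful habit when reading any ANOVA output:

```python
# Values taken directly from the PROC GLM table above
ss_model, df_model = 42.9396508, 5
ss_error, df_error = 94.2483492, 114

ms_model = ss_model / df_model   # approx. 8.5879302
ms_error = ss_error / df_error   # approx. 0.8267399
F = ms_model / ms_error

print(round(F, 2))  # 10.39, matching the PROC GLM output
```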
Post hoc tests (LSD, Tukey, Scheffé, Duncan's, Dunnett's, etc.)
A priori tests, or pre-planned comparisons (contrasts)

A priori tests are better. These are tests that the researcher plans on doing before gathering the data, and if we dedicate 1 d.f. to each one we generally feel comfortable doing each at some specified level of alpha. However, since multiple tests do entail the risk of a higher experiment-wide error rate, it would not be unreasonable to apply some technique, like Bonferroni's adjustment, to insure an experimentwise error rate at the desired level of alpha ($\alpha$).

So how might we do these "post hoc" tests? The simplest approach would be to do pairwise tests of the treatments using something like the two-sample t-test. (If you are interested in testing between treatment level means, then you probably have "fixed" effects; if the levels were randomly selected from a large number of possible choices, we would probably not be interested in the individual levels chosen.) This test examines the null hypothesis $H_0: \mu_1 = \mu_2$, or $H_0: \mu_1 - \mu_2 = 0$, against the alternative $H_1: \mu_1 - \mu_2 \neq 0$ (or the one-sided alternatives $\mu_1 - \mu_2 > 0$ or $\mu_1 - \mu_2 < 0$).

Recall two things about the two-sample t-test. First, in a t-test we had to determine whether the variance was equal for the two populations tested; we tested $H_0: \sigma_1^2 = \sigma_2^2$ with an F test to determine if this was the case. Second, the variance of the test (the variance of the difference between $\bar{Y}_1$ and $\bar{Y}_2$) was equal to $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$. This is the variance of the linear combination from our null hypothesis: the variance of $\bar{Y}_1 - \bar{Y}_2$ is $(1)^2\sigma_{\bar{Y}_1}^2 + (-1)^2\sigma_{\bar{Y}_2}^2$, if the variables are independent. If the variances are equal (as they are often assumed to be for ANOVA), then the variance is $\sigma^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)$. We estimate $\sigma^2$ with the mean square error (MSE).

So, we would test each pair of means using the two-sample t-test as $t = \dfrac{\bar{Y}_1-\bar{Y}_2}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$. For ANOVA, using the MSE as our variance estimate, we have $t = \dfrac{\bar{Y}_1-\bar{Y}_2}{\sqrt{MSE\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}$.
If the design is balanced, this simplifies to $t = \dfrac{\bar{Y}_1-\bar{Y}_2}{\sqrt{2MSE/n}}$.

Notice that if the calculated value of t is greater than the tabular value of t, we reject the null hypothesis; to the contrary, if the calculated value is less than the tabular value, we fail to reject. Call the tabular value $t^*$, and write the case for rejection of $H_0$ as

$$\frac{\bar{Y}_1-\bar{Y}_2}{\sqrt{2MSE/n}} > t^*, \quad \text{i.e.} \quad \bar{Y}_1-\bar{Y}_2 > t^*\sqrt{\frac{2MSE}{n}}.$$

So, for any difference $\bar{Y}_1-\bar{Y}_2$ greater than $t^*\sqrt{2MSE/n}$ we find the difference between the means to be statistically significant (reject $H_0$), and for any value less than this we find the difference to be consistent with the null hypothesis. Right?

This value, $t^*\sqrt{2MSE/n}$, is what R. A. Fisher called the "Least Significant Difference", commonly called the LSD (not to be confused with the Latin Square Design, also abbreviated LSD):

$$LSD = t_{critical}\sqrt{MSE\left(\tfrac{1}{n_1}+\tfrac{1}{n_2}\right)} \quad \text{or} \quad LSD = t_{critical}\,S_{\bar{Y}_1-\bar{Y}_2}.$$

This value is the exact width of an interval for $\bar{Y}_1-\bar{Y}_2$ which would give a t-test equal to $t_{critical}$. Any larger difference would be "significant" and any smaller difference would not. We calculate this value for each pair of means, and if the observed difference is less, the treatments are "not significantly different"; if greater, they are "significantly different".

One last detail. I have used the simpler version of the variance, which assumes $n_1 = n_2$. If the experiment is unbalanced (i.e., there are unequal numbers of observations in the treatment levels), then the value under the square root is $MSE\left(\frac{1}{n_1}+\frac{1}{n_2}\right)$.
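A minimal sketch of the LSD computation, using the MSE from the cuckoo table above. The group size n = 20 and the critical value t* = 1.981 (approximately the two-sided 5% critical t with 114 error d.f., read from a t table) are assumptions for illustration, since Python's standard library provides no t distribution:

```python
from math import sqrt

mse = 0.8267399   # MSE from the ANOVA table above
n = 20            # hypothetical balanced group size (assumption for illustration)
t_crit = 1.981    # approx. two-sided 5% critical t, 114 d.f. (from tables)

lsd = t_crit * sqrt(2 * mse / n)  # Fisher's Least Significant Difference

# Any pair of group means differing by more than `lsd` is declared
# significantly different; any smaller difference is not.
def significantly_different(mean1, mean2, lsd=lsd):
    return abs(mean1 - mean2) > lsd

print(round(lsd, 3))
```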

This note was uploaded on 12/29/2011 for the course EXST 7015 taught by Professor Wang,j during the Fall '08 term at LSU.
