EXST7015 Fall2011 Lect20


Statistical Techniques II — Page 88

Comparison of ranked means works very well if the analysis is balanced. If the analysis is not balanced there can be a problem: means that are close together may be significantly different, while means with a greater difference are not significantly different. Consider three treatments where the variance of a difference is MSE(1/n1 + 1/n2) and MSE = 25:

  tmt   mean   n
  1     18     5
  2     13     100
  3     12     5

  test    diff   se       t value   d.f.   P value
  1 v 2   5      2.2913   2.1822    103    0.02398
  2 v 3   1      2.2913   0.4364    103    0.66343
  1 v 3   6      3.1623   1.8974    8      0.09435

Treatments 1 and 2 differ by 5 and are significantly different, while treatments 1 and 3 differ by 6 and are not: the large sample for treatment 2 gives a smaller standard error and more degrees of freedom. For unbalanced tests the best way to check for differences is to calculate a confidence interval for each mean and see whether the confidence intervals overlap. By default, SAS uses this approach for unbalanced means.

Post-ANOVA tests

Having rejected the null hypothesis in an analysis of variance, we usually wish to determine how the treatment levels differ from each other. This is the "post-ANOVA" part of the analysis. These tests fall into two general categories. We have already discussed the post hoc tests (LSD, Tukey, Scheffé, Duncan's, Dunnett's, etc.). These tests are often (usually?) done with no a priori hypotheses in mind. That means we do not have any particular comparisons in mind before doing the experiment; we want to examine many, or all, levels of the treatments for differences from one another, and each test is done with a probability of error. The use of an experimentwise error rate is intended to permit these a posteriori comparisons without inflating the error rate for the analysis.

We will now discuss a priori tests, or pre-planned comparisons (contrasts). These a priori tests are better in many ways because the researcher plans particular tests before the data are gathered. If we dedicate 1 d.f. to each one, we generally feel comfortable doing each test at some specified level of alpha, usually 0.05.
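Returning briefly to the unbalanced-comparison table above: the standard errors, t values and degrees of freedom can be verified with a few lines of arithmetic. A Python sketch (the course uses SAS; Python is used here only to check the numbers — the means, n's and MSE are those of the example):

```python
import math

MSE = 25.0
means = {1: 18.0, 2: 13.0, 3: 12.0}
ns    = {1: 5,    2: 100,  3: 5}

def pairwise(i, j):
    """Difference, standard error, t value and d.f. for treatments i vs j."""
    se = math.sqrt(MSE * (1 / ns[i] + 1 / ns[j]))   # se of the difference
    diff = means[i] - means[j]
    df = ns[i] + ns[j] - 2                          # d.f. as used in the table
    return diff, se, diff / se, df

for pair in [(1, 2), (2, 3), (1, 3)]:
    d, se, t, df = pairwise(*pair)
    print(f"{pair[0]} v {pair[1]}: diff={d:g} se={se:.4f} t={t:.4f} df={df}")
```

The run reproduces the table: a difference of 5 (1 v 2) is significant while a difference of 6 (1 v 3) is not, purely because of the unequal sample sizes.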
However, since multiple tests do entail risks of higher experimentwise error rates, it would not be unreasonable to apply some technique, like Bonferroni's adjustment, to ensure an experimentwise error rate at the desired level of alpha. When we want some smaller number of comparisons, and they are determined a priori (without looking at the data), then we can use a less stringent criterion. We generally feel comfortable with one test per degree of freedom at some specified level of alpha, just as we did in regression (examining each regression coefficient with an alpha level of error).

James P. Geaghan - Copyright 2011

Contrasts

Recall our discussion of linear combinations from multiple regression. The linear combination

  A = a*X + b*Y + c*Z

has variance

  Var(A) = a²Var(X) + b²Var(Y) + c²Var(Z) + 2ab Cov(X,Y) + 2ac Cov(X,Z) + 2bc Cov(Y,Z).

However, if the variables are independent we can take all covariances to be zero:

  Var(A) = a²Var(X) + b²Var(Y) + c²Var(Z).

In multiple regression we did not assume that the regression coefficients were independent. In ANOVA, however, we do consider the levels of a treatment to be independent.

Suppose we want to test the mean of two groups against the mean of 3 other groups:

  H0: (μ1 + μ2)/2 = (μ3 + μ4 + μ5)/3
  or  (μ1 + μ2)/2 − (μ3 + μ4 + μ5)/3 = 0
  or  (1/2)μ1 + (1/2)μ2 − (1/3)μ3 − (1/3)μ4 − (1/3)μ5 = 0
  or  3μ1 + 3μ2 − 2μ3 − 2μ4 − 2μ5 = 0.

The variance of a mean is σ²/n. In ANOVA all of the σ² are estimated by the MSE. The n may or may not be equal. Since we do not need the covariances, we can calculate the variance of the linear combination

  (1/2)μ1 + (1/2)μ2 − (1/3)μ3 − (1/3)μ4 − (1/3)μ5

as

  (1/2)²MSE/n1 + (1/2)²MSE/n2 + (1/3)²MSE/n3 + (1/3)²MSE/n4 + (1/3)²MSE/n5
  = MSE [1/(4n1) + 1/(4n2) + 1/(9n3) + 1/(9n4) + 1/(9n5)].

If the design is balanced (i.e.
the sample sizes are equal) this simplifies to

  (MSE/n)(1/4 + 1/4 + 1/9 + 1/9 + 1/9) = (MSE/n)(1/2 + 1/3) = (5/6)(MSE/n).

This calculation of variance can be done for any linear combination (contrast) that we want to test. Of course, SAS can do this for us.

Note that we already saw this for the two-sample t-test as

  t = (Ȳ1 − Ȳ2) / sqrt[ MSE (1/n1 + 1/n2) ],

and if the design is balanced this simplifies to

  t = (Ȳ1 − Ȳ2) / sqrt( 2MSE/n ).

This is still true for comparing means in analysis of variance (e.g. the LSD).

We also saw another type of application of linear combinations. If you want to test a hypothesis between two or more independent estimates, like

  H0: μ1 = 0.5μ2   or   H0: μ1 − 0.5μ2 = 0,

we note that since these are independent, the variance for this t-test will be

  Var(μ̂1) + (1/4)Var(μ̂2) = MSE/n1 + MSE/(4n2),

and for equal sample sizes this reduces to 1.25 MSE/n.

The general formula for contrasts is H0: m1μ1 + m2μ2 + m3μ3 + ... + mtμt = 0, where we call the mi the "multipliers" and estimate μi with Ȳi and the variance with MSE. The test statistic is

  t = Σ miȲi / sqrt( MSE Σ mi²/ni ),

or, for a balanced experiment (i.e. ni equal),

  t = Σ miȲi / sqrt( (MSE/n) Σ mi² ).

So all we need are the multipliers. These are often called the "contrast". I prefer them as integers. Note that for the example we have examined,

  H0: (1/2)μ1 + (1/2)μ2 − (1/3)μ3 − (1/3)μ4 − (1/3)μ5 = 0

can be multiplied through by 6, yielding the hypothesis

  H0: 3μ1 + 3μ2 − 2μ3 − 2μ4 − 2μ5 = 0.

The multipliers are 3, 3, −2, −2, −2 instead of 1/2, 1/2, −1/3, −1/3, −1/3.

One final note. If we calculate our "contrast" as above but without the MSE in the denominator, using the treatment totals Ti,

  Q = Σ aiTi / sqrt( n Σ ai² ),

then all that remains to complete the t-test is to divide by sqrt(MSE). The value called "Q", when divided by sqrt(MSE), gives a t statistic. If we calculate Q² and divide by MSE we get an F statistic. SAS uses F tests.
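The general t formula above is easy to compute directly, and it also shows why the choice between fractional and integer multipliers does not matter: rescaling the multipliers rescales the estimate and its standard error by the same factor. A Python sketch (the treatment means, n and MSE below are made up for illustration; the notes use SAS for the real computation):

```python
import math

def contrast_t(multipliers, means, ns, mse):
    """t statistic for H0: sum(m_i * mu_i) = 0, assuming independent treatment means."""
    est = sum(m * y for m, y in zip(multipliers, means))
    var = mse * sum(m * m / n for m, n in zip(multipliers, ns))   # MSE * sum(m^2/n)
    return est / math.sqrt(var)

means = [18, 16, 11, 12, 13]   # hypothetical treatment means
ns    = [6, 6, 6, 6, 6]        # balanced design, n = 6 per level
mse   = 25.0

# same contrast written two ways: fractions vs. the integer version (x6)
t_frac = contrast_t([1/2, 1/2, -1/3, -1/3, -1/3], means, ns, mse)
t_int  = contrast_t([3, 3, -2, -2, -2], means, ns, mse)
print(t_frac, t_int)   # identical up to floating-point rounding
```

Because both the numerator and the denominator scale by 6, the two t values agree, which is why the integer multipliers 3, 3, −2, −2, −2 are a legitimate (and tidier) replacement for 1/2, 1/2, −1/3, −1/3, −1/3.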
All we need to provide SAS are the values of "a", the coefficients, in the correct order, and it will calculate and test the "contrast" with an F statistic.

Coming up with the multipliers

The multipliers should be a priori contrasts of interest to the investigator. As such, there are no "right or wrong" contrasts as long as they conform to one basic rule: the multipliers must add to zero.

Let's suppose we are measuring hemoglobin concentrations in the blood, and are interested in the following animals: Cats, Dogs, Humans, Whales (Pilot), Fish (perch), Sharks (Dogfish), Birds (Chicken).

Now let's come up with a contrast that compares terrestrial to aquatic (e.g. Cats, Dogs, Humans and Birds versus Whales, Fish and Sharks).

  Terrestrial: Cats, Dogs, Humans, Birds (Chicken)
  Aquatic: Whales (Pilot), Fish (perch), Sharks (Dogfish)

We compare the mean of 4 means to the mean of 3 means,

  H0: (μCats + μDogs + μHumans + μBirds)/4 − (μWhales + μFish + μSharks)/3 = 0,

or, using multipliers of 1/4 and 1/3,

  H0: (1/4)μCats + (1/4)μDogs + (1/4)μHumans + (1/4)μBirds − (1/3)μWhales − (1/3)μFish − (1/3)μSharks = 0,

and if we multiply through by 12 we get integers:

  H0: 3μCats + 3μDogs + 3μHumans + 3μBirds − 4μWhales − 4μFish − 4μSharks = 0.

Make sure the set of multipliers for one of the groups is negative so that the multipliers sum to zero. The multipliers are then

  Tmt level:   Cats  Dogs  Humans  Whales  Fish  Sharks  Birds
  multiplier:   3     3     3      −4      −4    −4       3

What if we wanted to compare mammals to the others? We are still comparing 4 things to 3, where one group is negative.

  Mammals: Cats, Dogs, Humans, Whales
  Others: Fish, Sharks, Birds

The multipliers are

  Tmt level:   Cats  Dogs  Humans  Whales  Fish  Sharks  Birds
  multiplier:   3     3     3       3      −4    −4      −4

How about comparing Humans to the others? Now we are comparing 1 thing to 6. Give the 6 a coefficient of 1 and the 1 a coefficient of 6, and make one set negative.
  Us: Humans
  Them: Cats, Dogs, Whales, Fish, Sharks, Birds

The multipliers are

  Tmt level:   Cats  Dogs  Humans  Whales  Fish  Sharks  Birds
  multiplier:   1     1    −6       1       1     1       1

One final comparison: how about scaled to unscaled? We are comparing 5 things to 2. The two get a 5, the five get a 2, and one set is negative.

  Scaled: Fish, Sharks
  Unscaled: Humans, Cats, Dogs, Whales, Birds

The multipliers are

  Tmt level:   Cats  Dogs  Humans  Whales  Fish  Sharks  Birds
  multiplier:   2     2     2       2      −5    −5       2

Our contrasts of interest are

  Tmt level:  Cats  Dogs  Humans  Whales  Fish  Sharks  Birds
  Ter/Aq       3     3     3      −4      −4    −4       3
  Mam          3     3     3       3      −4    −4      −4
  Hum          1     1    −6       1       1     1       1
  Scale        2     2     2       2      −5    −5       2

Contrast multiplier lists may be expressed vertically or horizontally. SAS code would usually be expressed as a horizontal list of multipliers:

  Contrast:  Ter/Aq  Mam  Hum  Scale
  Cats        3       3    1    2
  Dogs        3       3    1    2
  Humans      3       3   −6    2
  Whales     −4       3    1    2
  Fish       −4      −4    1   −5
  Sharks     −4      −4    1   −5
  Birds       3      −4    1    2

Another example, where Yi = weekly mean pounds of laundry. Treatment levels: His, Hers, Ours.

First, contrast His and Hers combined versus Ours:

  H0: (μHIS + μHers)/2 = μOURS
  or  0.5μHIS + 0.5μHers − μOURS = 0
  or  μHIS + μHers − 2μOURS = 0.

Then contrast His to Hers (leave out Ours):

  H0: μHIS = μHers  or  μHIS − μHers = 0.

  Tmt level:    His  Hers  Ours
  His v Hers     1   −1     0
  H&H v Ours     1    1    −2

Note: use a multiplier of 0 to omit a mean.

We have discussed the t test of contrasts. These can be done either as t tests or F tests (where t² = F). However, some "joint" contrasts may take more than 1 d.f., and these must be done as F tests. SAS uses F tests since these can handle either type of test.

For the fungicide example of a block design, suppose we had certain tests we wanted to perform: first, compare the CHECK treatment to the others; then compare among selected fungicides, one or two against the others. Contrasts in SAS require only the multipliers for calculation.
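The recipe used in all of the examples above — give each mean the size of the opposite group as its multiplier, make one side negative, and give omitted means a 0 — can be written down once and reused. A small Python sketch (`group_contrast` is a hypothetical helper for illustration, not part of the course materials):

```python
def group_contrast(levels, group1, group2):
    """Multipliers comparing the mean of group1's means to the mean of group2's.

    Each level in group1 gets +len(group2); each level in group2 gets
    -len(group1); levels in neither group get 0 (omitted from the test).
    """
    return [len(group2) if lv in group1
            else -len(group1) if lv in group2
            else 0
            for lv in levels]

animals = ["Cats", "Dogs", "Humans", "Whales", "Fish", "Sharks", "Birds"]

# terrestrial (4 levels) vs aquatic (3 levels)
ter_aq = group_contrast(animals,
                        ["Cats", "Dogs", "Humans", "Birds"],
                        ["Whales", "Fish", "Sharks"])
# humans (1 level) vs everything else (6 levels)
hum = group_contrast(animals,
                     ["Cats", "Dogs", "Whales", "Fish", "Sharks", "Birds"],
                     ["Humans"])

print(ter_aq)  # [3, 3, 3, -4, -4, -4, 3]
print(hum)     # [1, 1, -6, 1, 1, 1, 1]
assert sum(ter_aq) == 0 and sum(hum) == 0   # multipliers must add to zero
```

The final assertion enforces the one basic rule: whatever groups are chosen, the multipliers must sum to zero.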
Made-up contrast statements for the fungicide example, with contrast results. Note that each test has an α% chance of error. This is a comparisonwise error rate, but it is usually considered acceptable if the user does not do more than one test per treatment degree of freedom.

Statistics quote: Smoking is one of the leading causes of statistics. -- Fletcher Knebel

Contrast example

For the wireworm fumigant example, suppose we wanted to test for the effect of fumigants (on the average) against the control, and we wanted to test for a difference between the two fumigants. No other tests were of interest. The SAS statements are

  PROC MIXED DATA=FUMIGANT cl;
    CLASSES FUMIGANT BLOCK REP;
    TITLE3 'ANOVA with PROC MIXED - RBD with reps';
    MODEL WORMS = FUMIGANT / htype=3 DDFM=Satterthwaite outp=ResidDataP outpm=ResidDataPM;
    RANDOM BLOCK FUMIGANT*BLOCK;
    *** FUMIGANT levels ---------------- 0 C S;
    CONTRAST 'Control v others' FUMIGANT -2 1 1;
    CONTRAST 'C v S'            FUMIGANT 0 -1 1;

Note that in all other tests with this example (including other post-ANOVA tests) an error term must be specified; this is not necessary in PROC MIXED. The contrast results are

  Type 3 Tests of Fixed Effects
  Effect     Num DF   Den DF   F Value   Pr > F
  FUMIGANT   2        8        5.98      0.0258

  Contrasts
  Label              Num DF   Den DF   F Value   Pr > F
  Control v others   1        8        11.88     0.0087
  C v S              1        8        0.08      0.7812

The interpretation is clear: the mean of the two fumigants is different from the control, and the two fumigants are not significantly different from each other.

Orthogonality

Two things have been mentioned about contrasts:
  They must sum to zero.
  They test a contrast of interest to the investigator, so there is no necessarily right or wrong contrast (as long as it tests what you intend).

There is one other feature of contrasts called orthogonality. When you do a contrast you basically factor off a part of the treatment SS and test that contrast. If you have 5 d.f.
in the treatment, and you do 5 contrasts, will they add up to more, less, or exactly the treatment SS? It depends. Like TYPE III SS (which contrasts are), they can add up to more or less. However, if the contrasts are orthogonal they will add up to exactly the treatment SS.

Orthogonality exists when all pairwise crossproducts of the contrasts sum to zero.

  Tmt level:     His  Hers  Ours
  His v Hers      1   −1     0
  H&H v Ours      1    1    −2
  Crossproduct    1   −1     0    (sums to zero)

If there are 3 contrasts you must calculate all 3 pairwise crossproducts, and if any one does not sum to zero, the contrasts together are not orthogonal. For 4 contrasts there are 6 pairwise crossproducts, etc.

Orthogonal Polynomial Contrasts (see Appendix 15)

In cases where treatment levels are quantitative, the test of a linear or quadratic trend may be of interest. Tables of contrast multipliers that test polynomial trends are available for equally spaced treatments. It is feasible to get multipliers for treatment levels that are NOT equally spaced, but there is an infinite number of such spacings, so tables are not available; multipliers can be obtained from a SAS IML function called ORPOL (see Appendix 15). Note that these sets of multipliers for polynomial trends are orthogonal, so they can be placed in any order, and in a balanced design the results should not change regardless of the order used. The tables are available in many statistics textbooks (and below).
Orthogonal polynomial multipliers (equally spaced X)

Levels = 3:
  X:          1   2   3
  linear:    −1   0   1
  quadratic:  1  −2   1

Levels = 4:
  X:          1   2   3   4
  linear:    −3  −1   1   3
  quadratic:  1  −1  −1   1
  cubic:     −1   3  −3   1

Levels = 5:
  X:          1   2   3   4   5
  linear:    −2  −1   0   1   2
  quadratic:  2  −1  −2  −1   2
  cubic:     −1   2   0  −2   1
  quartic:    1  −4   6  −4   1

Levels = 6:
  X:          1   2   3   4   5   6
  linear:    −5  −3  −1   1   3   5
  quadratic:  5  −1  −4  −4  −1   5
  cubic:     −5   7   4  −4  −7   5
  quartic:    1  −3   2   2  −3   1
  quintic:   −1   5 −10  10  −5   1

Levels = 7:
  X:          1   2   3   4   5   6   7
  linear:    −3  −2  −1   0   1   2   3
  quadratic:  5   0  −3  −4  −3   0   5
  cubic:     −1   1   1   0  −1  −1   1
  quartic:    3  −7   1   6   1  −7   3
  quintic:   −1   4  −5   0   5  −4   1
  sextic:     1  −6  15 −20  15  −6   1

Levels = 8:
  X:          1   2   3   4   5   6   7   8
  linear:    −7  −5  −3  −1   1   3   5   7
  quadratic:  7   1  −3  −5  −5  −3   1   7
  cubic:     −7   5   7   3  −3  −7  −5   7
  quartic:    7 −13  −3   9   9  −3 −13   7
  quintic:   −7  23 −17 −15  15  17 −23   7
  sextic:     1  −5   9  −5  −5   9  −5   1
  septic:    −1   7 −21  35 −35  21  −7   1

Levels = 9:
  X:          1   2   3   4   5   6   7   8   9
  linear:    −4  −3  −2  −1   0   1   2   3   4
  quadratic: 28   7  −8 −17 −20 −17  −8   7  28
  cubic:    −14   7  13   9   0  −9 −13  −7  14
  quartic:   14 −21 −11   9  18   9 −11 −21  14
  quintic:   −4  11  −4  −9   0   9   4 −11   4
  sextic:     4 −17  22   1 −20   1  22 −17   4
  septic:    −1   6 −14  14   0 −14  14  −6   1
  octic:      1  −8  28 −56  70 −56  28  −8   1

For levels of X that are not equally spaced there is a SAS IML instruction that will produce the orthogonal polynomial multipliers. The following statements will do this if you have SAS IML available:

  OPTIONS PS=60 LS=78;
  PROC IML;
    RESET PRINT;
    X = {1, 2, 3, 4, 8};
    O = ORPOL(X, 3);
  RUN; QUIT;

where the X vector gives the levels of the quantitative variable. The ORPOL function needs one parameter specifying the quantitative variable vector and a second parameter specifying the highest degree of orthogonal polynomial desired. When fitted, these one d.f. contrasts are interpreted very much like slopes in a polynomial regression (except TYPE I SS are not needed, since the contrasts are orthogonal).

Example: test the millet yield example for quantitative trends. Treatment was row spacing of 2, 4, 6, 8 and 10 inches.
  PROC MIXED DATA=MILLET cl;
    CLASSES ROW COLUMN Spacing;
    TITLE3 'ANOVA with PROC MIXED - Latin Square';
    MODEL YIELD = Spacing / htype=3 DDFM=Satterthwaite outp=ResidDataP;
    RANDOM ROW COLUMN;
    *** Row spacing levels ------ A B C D E;
    CONTRAST 'Linear   ' Spacing -2 -1  0  1  2;
    CONTRAST 'Quadratic' Spacing  2 -1 -2 -1  2;
    CONTRAST 'Cubic    ' Spacing -1  2  0 -2  1;
    CONTRAST 'Quartic  ' Spacing  1 -4  6 -4  1;

Statistics quote: "Like other occult techniques of divination, the statistical method has a private jargon deliberately contrived to obscure its methods from non-practitioners." -- G. O. Ashley

Millet yield ANOVA:

  Type 3 Tests of Fixed Effects
  Effect    Num DF   Den DF   F Value   Pr > F
  Spacing   4        12       0.98      0.4523

  Contrasts
  Label       Num DF   Den DF   F Value   Pr > F
  Linear      1        12       3.75      0.0766
  Quadratic   1        12       0.03      0.8713
  Cubic       1        12       0.14      0.7178
  Quartic     1        12       0.02      0.8860

[Figure: scatter plot titled "Millet Yield on row spacing"; yield (y axis, roughly 230 to 275) plotted against row spacing (x axis, 0 to 12 inches).]

Millet yield ANOVA done in GLM:

  Source     DF   Type III SS   Mean Square   F Value   Pr > F
  ROW        4    13601.36000   3400.34000    3.22      0.0516
  COLUMN     4    6146.16000    1536.54000    1.46      0.2758
  TREATMNT   4    4156.56000    1039.14000    0.98      0.4523

  Contrast    DF   Contrast SS   Mean Square   F Value   Pr > F
  Linear      1    3960.500000   3960.500000   3.75      0.0766
  Quadratic   1    28.928571     28.928571     0.03      0.8713
  Cubic       1    144.500000    144.500000    0.14      0.7178
  Quartic     1    22.631429     22.631429     0.02      0.8860

Note that the contrast SS sum to the treatment SS.

Summary

A priori contrasts are usually preferred because they use fewer d.f. than other post hoc tests and address research hypotheses directly.
Contrasts are linear combinations of the treatment level means.
Single degree of freedom tests could be done as t-tests. Multiple degree of freedom tests are possible, but use the F test.
Orthogonal tests will sum to the treatment sum of squares.
Non-orthogonal contrasts will not, but this is not a problem as long as they test the hypotheses of interest.
Quantitative treatments should be addressed with orthogonal polynomial contrasts.
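As a final check tying the pieces together, the multipliers in the Levels = 5 table above can be reproduced from first principles: take the powers x⁰, x¹, x², ..., Gram–Schmidt-orthogonalize them, and rescale each result to smallest integers. This Python sketch uses exact rational arithmetic and illustrates the idea behind ORPOL (SAS's ORPOL itself returns normalized columns rather than integers, and may use a different algorithm internally):

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def orpoly(x, maxdeg):
    """Integer orthogonal-polynomial contrast multipliers for quantitative levels x."""
    x = [Fraction(v) for v in x]
    basis, out = [], []
    for d in range(maxdeg + 1):
        v = [xi ** d for xi in x]                 # next power of x
        for b in basis:                           # Gram-Schmidt: remove lower degrees
            c = sum(vi * bi for vi, bi in zip(v, b)) / sum(bi * bi for bi in b)
            v = [vi - c * bi for vi, bi in zip(v, b)]
        basis.append(v)
        if d > 0:                                 # skip the constant term
            den = reduce(lambda a, b: a * b // gcd(a, b),
                         (f.denominator for f in v), 1)      # clear denominators
            ints = [int(f * den) for f in v]
            g = reduce(gcd, (abs(i) for i in ints))          # smallest integers
            out.append([i // g for i in ints])
    return out

lin, quad, cub, quart = orpoly([1, 2, 3, 4, 5], 4)
print(lin, quad, cub, quart)
# all pairwise crossproducts sum to zero, so the set is orthogonal
for a, b in [(lin, quad), (lin, cub), (lin, quart), (quad, cub), (quad, quart), (cub, quart)]:
    assert sum(i * j for i, j in zip(a, b)) == 0
```

The run recovers the tabled multipliers −2 −1 0 1 2 (linear), 2 −1 −2 −1 2 (quadratic), −1 2 0 −2 1 (cubic) and 1 −4 6 −4 1 (quartic), and the crossproduct loop confirms the orthogonality that makes their contrast SS sum to the treatment SS in a balanced design.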

This note was uploaded on 12/29/2011 for the course EXST 7015 taught by Professor Wang, J. during the Fall '08 term at LSU.
