Lecture 33
Nancy Pfenning, Stats 1000

Chapter 16: Analysis of Variance

Last time, we wanted to test whether the difference among 3 observed mean test scores (82, 66, and 60) could be easily enough attributed to chance variation:

$H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: not all the $\mu_i$ are equal.

To express $H_a$ with mathematical notation would be too awkward: one would have to write

$H_a: \mu_1 = \mu_2 \neq \mu_3$ or $\mu_1 = \mu_3 \neq \mu_2$ or $\mu_2 = \mu_3 \neq \mu_1$ or $\mu_1, \mu_2, \mu_3$ all different.

Our test statistic $F$ is the ratio of variation among means (MSG) to variation within groups (MSE). If this ratio is large, we have evidence that the means differ and we will reject $H_0$.

An ANOVA table organizes the calculations needed to perform the $F$ test. For our exam problem, we have $I = 3$ groups and $N = 20$ observations. "Source" refers to the source of variation and "df" are the degrees of freedom.

Source   df                 Sum of Squares   Mean Sum of Squares     F               P-value
Group    DFG = I - 1 = 2    SSG = 1720       MSG = 1720/2 = 860      MSG/MSE ≈ 4.5   in (.025, .05)
Error    DFE = N - I = 17   SSE = 3245       MSE = 3245/17 ≈ 191
Total    N - 1 = 19

Under the null hypothesis of equal population means, the $F$ statistic has a distribution with $I - 1$ df in the numerator and $N - I$ df in the denominator; in this example, it is $F(2, 17)$. The P-value is the probability that, assuming the null hypothesis is true, an $F(2, 17)$ random variable would take a value at least as large as the one observed:

P-value $= P(F \geq 4.5)$

Consulting the F tables, page 586 shows $F$ critical values for 2 df in the numerator, and 15 or 20 in the denominator. To be conservative, we will use 15 (its slightly larger critical values make it a little more difficult to reject the null hypothesis). Since 4.5 is between 3.68 and 4.77, the P-value is between .025 and .050. This is, in general, small enough to reject $H_0$. We conclude that the difference in observed mean scores is unlikely to be a result of chance variation; rather, we have evidence that the three exams did not share the same level of difficulty.
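The table arithmetic and the table lookup above can be checked numerically. The following is a minimal sketch in Python, not part of the original lecture. It relies on the fact that when the numerator has exactly 2 df, the upper-tail probability of an F distribution has a simple closed form, $P(F \geq f) = (1 + 2f/d)^{-d/2}$, where $d$ is the denominator df. The function names `f_sf_2df` and `f_crit_2df` are hypothetical helpers introduced here for illustration, and the formulas do not apply to other numerator df.

```python
def f_sf_2df(f_value, df_den):
    """Upper-tail probability P(F >= f_value) for an F(2, df_den) variable.

    Closed form that is valid ONLY when the numerator has 2 df.
    """
    return (1.0 + 2.0 * f_value / df_den) ** (-df_den / 2.0)


def f_crit_2df(alpha, df_den):
    """Critical value c with P(F >= c) = alpha for F(2, df_den).

    Obtained by inverting f_sf_2df; again valid only for 2 numerator df.
    """
    return (df_den / 2.0) * (alpha ** (-2.0 / df_den) - 1.0)


# Quantities taken from the ANOVA table in the notes.
I, N = 3, 20                 # number of groups, number of observations
SSG, SSE = 1720.0, 3245.0    # sums of squares for groups and error

DFG = I - 1                  # numerator degrees of freedom
DFE = N - I                  # denominator degrees of freedom
MSG = SSG / DFG              # mean square for groups
MSE = SSE / DFE              # mean square for error
F = MSG / MSE                # observed F statistic

p_value = f_sf_2df(F, DFE)   # exact tail area under F(2, 17)

print(f"DFG={DFG}, DFE={DFE}, MSG={MSG:.1f}, MSE={MSE:.1f}, F={F:.2f}")
print(f"P-value = {p_value:.4f}")

# Critical values that the printed F table lists for 15 and 20 denominator df.
for df_den in (15, 20):
    c05 = f_crit_2df(0.05, df_den)
    c025 = f_crit_2df(0.025, df_den)
    print(f"F(2, {df_den}): .05 critical = {c05:.2f}, .025 critical = {c025:.2f}")
```

Run as written, the P-value should come out at roughly .027, inside the (.025, .05) bracket read from the table. The critical values for 15 denominator df come out larger than those for 20, which is why using 15 is the conservative choice.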
In this course, we take the analysis no further. In practice, more detailed comparisons called contrasts can be made to pinpoint which means differ. For example, we may be able to show that there is a significant difference only between the first exam and the other two, not between those two; in other words, maybe the first exam was less difficult and the other two were comparable.

Example: Check if mean earnings could be equal for all 1st, 2nd, 3rd, 4th, and other year Pitt students.

Exercise: Compare values of a quantitative survey variable for more than two categorical groups by carrying out an ANOVA test in MINITAB. State your conclusions in terms of the particular variables chosen.

Lecture 34

Chapter 15: More About Categorical Variables

Example: Results of a labor survey in March 1988 for an SRS of 914 California men aged 35 to 44 are shown below:

Married?     Employed   Unemployed   Total   Proportion Employed
Currently    638        27           665     p1 = 638/665 = .959
Previously   133        8            141     p2...
This note was uploaded on 02/15/2012 for the course STAT 1000 taught by Professor Taeyoungpark during the Fall '06 term at Pittsburgh.
