Final Exam Practice Solutions: ANOVA and Logistic Regression

ANOVA

(1) ANOVA Basics:

(a) The completed ANOVA table is shown below. The degrees of freedom must add, so the within-group degrees of freedom is 31 − 3 = 28. We know that mean squares are sums of squares divided by degrees of freedom. Hence SSB/3 = 16.42 for the between groups, or SSB = (3)(16.42) = 49.26. Similarly, MSW = SSW/df = 102.5/28 = 3.66. We also know that the sums of squares add, so SST = SSB + SSW = 49.26 + 102.5 = 151.76. Finally, F = MSB/MSW = 16.42/3.66 = 4.49.

Source                 df   SS       MS      F
Between (Treatment)     3    49.26   16.42   4.49
Within (Error)         28   102.50    3.66
Total                  31   151.76

(b) We know that the degrees of freedom between groups is just the number of groups minus 1. This comes from the fact that in a regression we would only need G − 1 indicator variables to cover all the groups, with the remaining group as the reference. Since G − 1 = 3, there are G = 4 treatment groups here. Note that the term “treatment” refers to the group label. This terminology comes from the fact that in medical studies the groups often represent the different treatments or drug dosages that are given to patients and whose efficacy the study seeks to compare. However, a treatment could also be something like gender or smoking status.

(c) It is possible the ANOVA table comes from a balanced randomized design. A balanced design would have the same number of points in each group. Here there are 4 groups and a total of n = 32 data points. The latter we know because the total degrees of freedom was 31 in the ANOVA table and we know that this is just n − 1. With 32 points and 4 groups we could have 8 points per group, which is a balanced design. Of course, the design could have been unbalanced (we can’t tell for sure from just the table), and we know nothing at all about whether it was really randomized.

(d) The p-value we want is P(F_{3,28} ≥ 4.49).
From the F table in your text (which you are NOT responsible for using) the closest number of degrees of freedom is 3 and 30. We see that F_{3,30,.99} = 4.51, which is almost exactly our value, so we conclude the p-value is very close to .01. Using a significance level of α = .05 we would reject H_0 and conclude that at least some of the means in this problem (whatever it is about!) are different. If we wanted to use STATA to get the exact p-value we would type

    di Ftail(3,28,4.49)

The “di” stands for display. “Ftail” tells STATA to give us the probability of being bigger than the value indicated. The first two numbers in parentheses are the degrees of freedom and the final number is the F statistic. Note that this approach is better than the one I originally posted with the homework! The resulting p-value is .01077, which is very close to our estimate from the table....
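
As a cross-check outside STATA, the table arithmetic from (a) and the tail probability from (d) can be sketched in Python using only the standard library. This is an illustrative sketch, not part of the original solution: the helper name f_upper_tail is hypothetical, and the Simpson's-rule integration is just one simple way to evaluate the F tail. It assumes the denominator degrees of freedom exceed 2, so the integrand stays finite at zero.

```python
import math

# Known entries of the ANOVA table, taken from the problem text.
df_between, df_total = 3, 31
ms_between = 16.42
ss_within = 102.5

# Fill in the remaining entries using the additivity rules from part (a).
df_within = df_total - df_between      # degrees of freedom add
ss_between = df_between * ms_between   # MS = SS / df, so SS = df * MS
ms_within = ss_within / df_within
ss_total = ss_between + ss_within      # sums of squares add
f_stat = ms_between / ms_within


def f_upper_tail(f, d1, d2, steps=20000):
    """Approximate P(F_{d1,d2} >= f) for f > 0 and d2 > 2 (hypothetical helper).

    Uses the identity P(F >= f) = I_x(d2/2, d1/2) with x = d2 / (d2 + d1*f),
    where I_x is the regularized incomplete beta function, evaluated here
    with composite Simpson's rule (steps must be even).
    """
    a, b = d2 / 2.0, d1 / 2.0
    x = d2 / (d2 + d1 * f)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        # Beta(a, b) density; zero at t = 0 because a > 1 is assumed.
        if t <= 0.0:
            return 0.0
        return math.exp((a - 1) * math.log(t)
                        + (b - 1) * math.log(1.0 - t) - log_beta)

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3.0


# Use the F statistic rounded to two decimals, as in the text (4.49).
p_value = f_upper_tail(round(f_stat, 2), df_between, df_within)

print(f"df_within={df_within}  SSB={ss_between:.2f}  MSW={ms_within:.2f}")
print(f"SST={ss_total:.2f}  F={f_stat:.2f}  p-value={p_value:.5f}")
```

The printed p-value should land close to the .01077 that STATA reports. Passing the unrounded F statistic instead of 4.49 would shift the result slightly, which is why the sketch rounds first to mirror the hand calculation.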
 Winter '07
 Sugar
