420Hw05ans

Course: STAT 420, Spring 2012
School: UIllinois
STAT 420, Spring 2012
Homework #5
(due Friday, February 24, by 4:00 p.m.)

1. For the prostate data, fit a model with lpsa as the response and the other variables as predictors.

a) Plot the residuals vs. the fitted values. Check the constant variance assumption for the errors. ( 4.3 (a) )

> library(faraway)
> data(prostate)
> fit = lm(lpsa~lcavol+lweight+age+lbph+svi+lcp+gleason+pgg45, data=prostate)
> plot(fit$fitted.values,fit$residuals)
> abline(h=0,lty=2)

The residuals look quite random. There is no clear evidence of non-constant variance.

b) Make a histogram and a Normal Q-Q plot for the residuals. Check the normality assumption for the errors. ( 4.3 (b) )

> hist(fit$residuals)
> qqnorm(fit$residuals)

There is slight evidence of non-normality, but nothing pronounced. The normality assumption seems to be fine, and the Shapiro-Wilk test does not reject it:

> shapiro.test(fit$residuals)

        Shapiro-Wilk normality test

data:  fit$residuals
W = 0.9911, p-value = 0.7721

2. A society of bird watchers has collected data from several towns on stork sightings ( x ) and human births ( y ) to test the widely expressed belief that storks bring babies. Assume that ( X, Y ) have a bivariate normal distribution. The data are given in the table below:

Storks, x    18   16   10   20   14   26   22
Babies, y    27   15   13   21   19   39   27

\[ \sum x = 126, \quad \sum y = 161, \quad \sum x^2 = 2{,}436, \quad \sum y^2 = 4{,}175, \quad \sum x y = 3{,}150, \]
\[ \sum (x - \bar{x})^2 = 168, \quad \sum (y - \bar{y})^2 = 472, \quad \sum (x - \bar{x})(y - \bar{y}) = \sum (x - \bar{x})\, y = 252. \]

a) Test \( H_0 : \rho = 0 \) vs. \( H_a : \rho > 0 \) at the \( \alpha = 0.01 \) level of significance. What can you say about the p-value of this test?

\[ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} = \frac{252}{\sqrt{168 \cdot 472}} \approx 0.8949. \]

Test Statistic:
\[ t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{0.8949 \sqrt{7 - 2}}{\sqrt{1 - 0.8949^2}} \approx 4.484. \]

Rejection Region: Reject \( H_0 \) if \( t > t_{0.01}( n - 2 = 5 \text{ df} ) = 3.365 \).

Decision: Reject \( H_0 \).

P-value = area to the right of t. Since \( t > t_{0.005}(5) = 4.032 \), p-value < 0.005. ( p-value ≈ 0.00325. )

OR (Fisher's z transformation)

\[ W = \frac{1}{2} \ln \frac{1 + r}{1 - r} = \frac{1}{2} \ln \frac{1 + 0.8949}{1 - 0.8949} \approx 1.446. \]

Under \( H_0 \),
\[ \mu_W = \frac{1}{2} \ln \frac{1 + \rho_0}{1 - \rho_0} = \frac{1}{2} \ln \frac{1 + 0}{1 - 0} = 0, \qquad \sigma_W^2 = \frac{1}{n - 3} = \frac{1}{4}. \]

Test Statistic:
\[ z = \frac{W - \mu_W}{\sigma_W} = \frac{1.446 - 0}{\sqrt{1/4}} = 2.892. \]

Rejection Region: Reject \( H_0 \) if \( z > z_{0.01} = 2.326 \).

Decision: Reject \( H_0 \).

P-value = right tail = P( Z > 2.892 ) = 0.0019.

b) Does it follow from part (a) that storks do bring babies? Explain.

No. Correlation does not imply causation; there may be other reasons (other variables, other factors) for the association.

c) Test \( H_0 : \rho = 0.50 \) vs. \( H_1 : \rho > 0.50 \) at the \( \alpha = 0.05 \) level of significance. What can you say about the p-value of this test?

\[ W = \frac{1}{2} \ln \frac{1 + r}{1 - r} = \frac{1}{2} \ln \frac{1 + 0.8949}{1 - 0.8949} \approx 1.446. \]

Under \( H_0 \),
\[ \mu_W = \frac{1}{2} \ln \frac{1 + \rho_0}{1 - \rho_0} = \frac{1}{2} \ln \frac{1 + 0.50}{1 - 0.50} \approx 0.5493, \qquad \sigma_W^2 = \frac{1}{n - 3} = \frac{1}{4}. \]

Test Statistic:
\[ z = \frac{W - \mu_W}{\sigma_W} = \frac{1.446 - 0.5493}{\sqrt{1/4}} \approx 1.79. \]

Rejection Region: Reject \( H_0 \) if \( z > z_{0.05} = 1.645 \).

Decision: Reject \( H_0 \).

P-value = right tail = P( Z > 1.79 ) = 0.0367.

d) Test \( H_0 : \rho = 0.30 \) vs. \( H_1 : \rho \neq 0.30 \) at the \( \alpha = 0.05 \) level of significance. What can you say about the p-value of this test?

\[ W = \frac{1}{2} \ln \frac{1 + r}{1 - r} = \frac{1}{2} \ln \frac{1 + 0.8949}{1 - 0.8949} \approx 1.446. \]

Under \( H_0 \),
\[ \mu_W = \frac{1}{2} \ln \frac{1 + \rho_0}{1 - \rho_0} = \frac{1}{2} \ln \frac{1 + 0.30}{1 - 0.30} \approx 0.3095, \qquad \sigma_W^2 = \frac{1}{n - 3} = \frac{1}{4}. \]

Test Statistic:
\[ z = \frac{W - \mu_W}{\sigma_W} = \frac{1.446 - 0.3095}{\sqrt{1/4}} = 2.273. \]

Rejection Region: Reject \( H_0 \) if \( z < -z_{0.025} \) or \( z > z_{0.025} \), where \( z_{0.025} = 1.96 \).

Decision: Reject \( H_0 \).

P-value = 2 tails = 2 P( Z > 2.273 ) = 2 × 0.0116 = 0.0232.

e) Construct a 95% confidence interval for \( \rho \).

A 100 ( 1 − α ) % confidence interval for \( \rho \) is
\[ \left( \frac{e^a - 1}{e^a + 1}, \ \frac{e^b - 1}{e^b + 1} \right), \quad \text{where} \quad a = \ln \frac{1 + r}{1 - r} - \frac{2 z_{\alpha/2}}{\sqrt{n - 3}}, \quad b = \ln \frac{1 + r}{1 - r} + \frac{2 z_{\alpha/2}}{\sqrt{n - 3}}. \]

\[ a = 2.892 - \frac{2 \cdot 1.96}{\sqrt{4}} = 0.932, \qquad b = 2.892 + \frac{2 \cdot 1.96}{\sqrt{4}} = 4.852. \]

\[ \left( \frac{e^{0.932} - 1}{e^{0.932} + 1}, \ \frac{e^{4.852} - 1}{e^{4.852} + 1} \right) = ( 0.4350, \ 0.9845 ). \]

3. Do NOT use a computer for this problem. Consider the following data set:

[Data table: n = 10 observations of x_1, x_2, and y; the individual values did not survive extraction.]

Consider the model
\[ Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i, \]
where the \( \varepsilon_i \) are i.i.d. \( N( 0, \sigma^2 ) \). Then

\[ X^T X = \begin{bmatrix} 10 & 30 & 90 \\ 30 & 110 & 350 \\ 90 & 350 & 1170 \end{bmatrix}, \quad X^T Y = \begin{bmatrix} 150 \\ 470 \\ 1350 \end{bmatrix}, \quad ( X^T X )^{-1} = \begin{bmatrix} 0.775 & -0.45 & 0.075 \\ -0.45 & 0.45 & -0.1 \\ 0.075 & -0.1 & 0.025 \end{bmatrix}, \]

\[ \sum ( y - \hat{y} )^2 = 126, \quad \text{and} \quad \sum ( y - \bar{y} )^2 = 306. \]

a) Obtain the least-squares estimates \( \hat\beta_0 \), \( \hat\beta_1 \), and \( \hat\beta_2 \).

\[ \hat\beta = ( X^T X )^{-1} X^T Y = \begin{bmatrix} 0.775 & -0.45 & 0.075 \\ -0.45 & 0.45 & -0.1 \\ 0.075 & -0.1 & 0.025 \end{bmatrix} \begin{bmatrix} 150 \\ 470 \\ 1350 \end{bmatrix} = \begin{bmatrix} 6 \\ 9 \\ -2 \end{bmatrix}. \]

b) Perform the significance of the regression test at a 5% level of significance. Specify the null and the alternative hypotheses. Report the value of the test statistic, the critical value(s), and the decision.

n = 10, p = 3.

Source        SS     df            MS    F
Regression    180    p − 1 = 2     90    5
Residuals     126    n − p = 7     18
Total         306    n − 1 = 9

\( H_0 : \beta_1 = \beta_2 = 0 \).
\( H_a \) : at least one of \( \beta_1 \) and \( \beta_2 \) is different from 0.

Test Statistic: F = 90 / 18 = 5.
Critical Value: \( F_{0.05}( 2, 7 ) = 4.74 \).
Decision: Reject \( H_0 \). ( p-value ≈ 0.0448. )

c) Find the p-value (approximately) of the test \( H_0 : \beta_1 = 5 \) vs. \( H_1 : \beta_1 > 5 \).

\[ \widehat{\mathrm{Var}}( \hat\beta_1 ) = C_{11} s^2 = 0.45 \cdot 18 = 8.1. \]

Test Statistic:
\[ t = \frac{9 - 5}{\sqrt{8.1}} \approx 1.405457, \qquad n - p = 7 \text{ d.f.} \]

0.711 < 1.405457 < 1.415, that is, \( t_{0.25}(7) < t < t_{0.10}(7) \).
p-value = right tail, so 0.10 < p-value < 0.25, and p-value ≈ 0.10. ( p-value ≈ 0.10134. )

d) Find the p-value (approximately) of the test \( H_0 : \beta_1 = 11 \) vs. \( H_1 : \beta_1 \neq 11 \).

\[ \widehat{\mathrm{Var}}( \hat\beta_1 ) = C_{11} s^2 = 0.45 \cdot 18 = 8.1. \]

Test Statistic:
\[ t = \frac{9 - 11}{\sqrt{8.1}} \approx -0.70273, \qquad n - p = 7 \text{ d.f.} \]

−0.711 < −0.70273 < −0.263, that is, \( -t_{0.25}(7) < t < -t_{0.40}(7) \).
0.25 < left tail < 0.40, and left tail ≈ 0.25.
p-value = 2 tails, so 0.50 < p-value < 0.80, and p-value ≈ 0.50. ( p-value ≈ 0.504916. )

e) Find the p-value (approximately) of the test \( H_0 : \beta_2 = 0 \) vs. \( H_1 : \beta_2 \neq 0 \).

\[ \widehat{\mathrm{Var}}( \hat\beta_2 ) = C_{22} s^2 = 0.025 \cdot 18 = 0.45. \]

Test Statistic:
\[ t = \frac{( -2 ) - 0}{\sqrt{0.45}} \approx -2.9814, \qquad n - p = 7 \text{ d.f.} \]

−2.998 < −2.9814 < −2.365, that is, \( -t_{0.01}(7) < t < -t_{0.025}(7) \).
0.01 < left tail < 0.025, and left tail ≈ 0.01.
p-value = 2 tails, so 0.02 < p-value < 0.05, and p-value ≈ 0.02. ( p-value ≈ 0.020474. )

f) Construct a 90% prediction interval for the value of Y at \( x_1 = 2 \) and \( x_2 = 5 \).

\( X_0^T = [ \, 1 \ \ 2 \ \ 5 \, ] \), \( \hat{Y}_0 = 1 \cdot 6 + 2 \cdot 9 + 5 \cdot ( -2 ) = 14 \).

\[ X_0^T C X_0 = \begin{bmatrix} 1 & 2 & 5 \end{bmatrix} \begin{bmatrix} 0.775 & -0.45 & 0.075 \\ -0.45 & 0.45 & -0.1 \\ 0.075 & -0.1 & 0.025 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix} = 0.15. \]

\[ [ \, 1 + X_0^T C X_0 \, ] \, s^2 = ( 1 + 0.15 ) \cdot 18 = 20.7, \qquad t_{0.05}(7) = 1.895. \]

\[ 14 \pm 1.895 \sqrt{20.7}, \qquad \text{that is,} \qquad 14 \pm 8.62. \]

g) Construct a 95% prediction interval for the value of Y at \( x_1 = 4 \) and \( x_2 = 13 \).

\( X_0^T = [ \, 1 \ \ 4 \ \ 13 \, ] \), \( \hat{Y}_0 = 1 \cdot 6 + 4 \cdot 9 + 13 \cdot ( -2 ) = 16 \).

\[ X_0^T C X_0 = \begin{bmatrix} 1 & 4 & 13 \end{bmatrix} \begin{bmatrix} 0.775 & -0.45 & 0.075 \\ -0.45 & 0.45 & -0.1 \\ 0.075 & -0.1 & 0.025 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \\ 13 \end{bmatrix} = 0.15. \]

\[ [ \, 1 + X_0^T C X_0 \, ] \, s^2 = ( 1 + 0.15 ) \cdot 18 = 20.7, \qquad t_{0.025}(7) = 2.365. \]

\[ 16 \pm 2.365 \sqrt{20.7}, \qquad \text{that is,} \qquad 16 \pm 10.76. \]
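
Problem 3 is meant to be worked by hand, but the arithmetic can be verified afterwards. The following is a minimal sketch in R that uses only the summary quantities printed in the problem (X'X, X'Y, the two sums of squares, n = 10, p = 3). The variable names and the helper `pred_int` are illustrative and not part of the original solution. It should reproduce the estimates, test statistics, p-values, and prediction intervals of parts (a) through (g) up to rounding.

```r
# Summary quantities copied from problem 3
XtX <- matrix(c(10,  30,   90,
                30, 110,  350,
                90, 350, 1170), nrow = 3, byrow = TRUE)
XtY <- c(150, 470, 1350)
SSE <- 126   # sum of (y - y_hat)^2
SST <- 306   # sum of (y - y_bar)^2
n <- 10
p <- 3

C <- solve(XtX)                # (X'X)^{-1}
beta_hat <- drop(C %*% XtY)    # a) least-squares estimates
s2 <- SSE / (n - p)            # residual mean square

# b) significance of regression F test
F_stat <- ((SST - SSE) / (p - 1)) / s2
F_p <- pf(F_stat, p - 1, n - p, lower.tail = FALSE)

# c) H0: beta1 = 5 vs. beta1 > 5 (right tail)
t_c <- (beta_hat[2] - 5) / sqrt(C[2, 2] * s2)
p_c <- pt(t_c, n - p, lower.tail = FALSE)

# d) H0: beta1 = 11 vs. beta1 != 11 (two tails)
t_d <- (beta_hat[2] - 11) / sqrt(C[2, 2] * s2)
p_d <- 2 * pt(abs(t_d), n - p, lower.tail = FALSE)

# e) H0: beta2 = 0 vs. beta2 != 0 (two tails)
t_e <- (beta_hat[3] - 0) / sqrt(C[3, 3] * s2)
p_e <- 2 * pt(abs(t_e), n - p, lower.tail = FALSE)

# f), g) prediction interval at x0 = c(1, x1, x2)
pred_int <- function(x0, level) {
  y0 <- sum(x0 * beta_hat)
  se <- sqrt((1 + drop(t(x0) %*% C %*% x0)) * s2)
  tcrit <- qt(1 - (1 - level) / 2, n - p)
  c(fit = y0, lower = y0 - tcrit * se, upper = y0 + tcrit * se)
}
pi_f <- pred_int(c(1, 2, 5), 0.90)
pi_g <- pred_int(c(1, 4, 13), 0.95)
```

Printing `beta_hat`, `F_p`, `p_c`, `p_d`, `p_e`, `pi_f`, and `pi_g` gives the exact values that the t and F tables only bracket.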

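
The correlation tests and the confidence interval in problem 2 can be checked in R from the stork data. This is a sketch rather than part of the original solution: the variable names and the helper `fisher_z` are illustrative. It relies on the identities W = atanh(r) for Fisher's transformation and tanh for its inverse, which are equivalent to the logarithmic formulas used in parts (a) through (e). Because the hand solution rounds W and z before looking up the normal table, the results agree only to about three decimal places.

```r
x <- c(18, 16, 10, 20, 14, 26, 22)   # storks
y <- c(27, 15, 13, 21, 19, 39, 27)   # babies
n <- length(x)

r <- cor(x, y)                       # sample correlation

# a) t test of H0: rho = 0 vs. rho > 0
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_t <- pt(t_stat, n - 2, lower.tail = FALSE)
ct <- cor.test(x, y, alternative = "greater")   # same test, built in

# Fisher's z transformation: W = 0.5 * log((1 + r) / (1 - r)) = atanh(r)
W <- atanh(r)
sigma_W <- 1 / sqrt(n - 3)
fisher_z <- function(rho0) (W - atanh(rho0)) / sigma_W

z_a <- fisher_z(0)                                 # a) right tail
p_a <- pnorm(z_a, lower.tail = FALSE)
z_c <- fisher_z(0.50)                              # c) right tail
p_c <- pnorm(z_c, lower.tail = FALSE)
z_d <- fisher_z(0.30)                              # d) two tails
p_d <- 2 * pnorm(abs(z_d), lower.tail = FALSE)

# e) 95% confidence interval for rho, back-transformed with tanh
ci <- tanh(W + c(-1, 1) * qnorm(0.975) * sigma_W)
```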