Stat 231 Exam 3 Key F11

Stat 231 Exam 3
Fall 2011

I have neither given nor received unauthorized assistance on this exam.

KEY
Name Signed                                        Date

Name Printed

This exam concerns the analysis of some data collected by the Norwegian Public Roads Administration (part of the "NO2" data set available from the "statlib" data sets archive, submitted to the archive by Magne Aldrin) concerning how air pollution at a road is related to traffic volume, and time-related and meteorological variables. The response variable is

y : hourly values of the logarithm of the concentration of NO2 particles

measured at Alnabru in Oslo, Norway, between October 2001 and August 2003. The predictor variables in the original data set were

x1 : logarithm of the number of cars per hour
x2 : temperature 2m above ground (°C)
x3 : wind speed (m/s)
x4 : temperature difference between 25m and 2m above ground (°C)
x5 : wind direction (in degrees between 0 and 360)
x6 : hour of day (from 0 to 24)
x7 : day number from October 1, 2001

Vardeman also added predictors to the data set by defining the variables

sin(2πx5/360), cos(2πx5/360), sin(2πx6/24), cos(2πx6/24), sin(2πx7/365), and cos(2πx7/365)

Here are the sample correlations between the variables in this problem.

[JMP report: Correlations. A 14 x 14 matrix of sample correlations among y, x1, x2, x3, x4, x5, x6, x7, sinx5, cosx5, sinx6, cosx6, sinx7, and cosx7. The entries are not legible in the extracted text.]

8 pts  a) What single predictor will produce the largest value of R² for a simple linear regression analysis of y? Explain.

predictor: x1

explanation: x1 has the largest sample correlation with y (in absolute value). R² for a SLR is the squared sample correlation between y and the predictor, so x1 would produce the largest R².

Below are two JMP reports using x1 as a predictor of y. Use these in parts b) through f).

[JMP report: Bivariate Fit of y By x1, Linear Fit. Scatterplot of y versus x1 with the fitted line. Legible entries: Root Mean Square Error 0.7615; Observations (or Sum Wgts) 40; C. Total Sum of Squares 30.89 on 39 d.f.; Prob > F 0.0004; Parameter Estimates: Intercept 0.6524, x1 0.4495. Other entries are not legible in the extracted text.]

[JMP report: Bivariate Fit of y By x1, Polynomial Fit Degree=2. Scatterplot of y versus x1 with the fitted quadratic. Legible entries: fitted equation y = 2.2022 - 0.0623 x1 + 0.0388 x1²; Observations (or Sum Wgts) 40; Analysis of Variance d.f.: Model 2, Error 37, C. Total 39; Parameter Estimates: x1² estimate 0.0388 with Std Error 0.1173. Other entries are not legible in the extracted text.]

8 pts  b) What is the value of a test statistic and a corresponding p-value for testing whether x1 and x1² together provide some statistically detectable predictive power for describing y?

test statistic: F ≈ 7.51 (the F Ratio in the Analysis of Variance table for the quadratic fit, on 2 and 37 d.f.)

p-value: .0018

8 pts  c) Is the quadratic model in x1 a statistically significant improvement over the linear one in terms of ability to predict or explain y? (Answer "yes" or "no" and provide supporting values of a test statistic and its corresponding p-value.)

Yes / No (Circle one only): No

test statistic: t ≈ .33 (the t Ratio for the x1² coefficient)

p-value: ≈ .74

8 pts  d) Give 95% two-sided confidence limits for the standard deviation of log concentration of NO2 particles at a given traffic volume, assuming there is a linear relation between y and x1. (Plug in completely, but you need not simplify.)

s·sqrt((n-2)/χ²_upper) < σ < s·sqrt((n-2)/χ²_lower), with n - 2 = 38 d.f.:

.7615·sqrt(38/56.896) and .7615·sqrt(38/22.878)

8 pts  e) As it turns out, the standard error of predicted y for x1 = 7.0 is .1205 under the linear model. Use this fact and make 95% prediction limits for the log concentration of NO2 particles the next day that x1 = 7.0 at this location. (Plug in completely, but you need not simplify.)

ŷ ± t·sqrt(s² + SE_ŷ²), that is

(.6524 + .4495(7)) ± 2.024·sqrt((.7615)² + (.1205)²)

8 pts  f) y and x1 are both expressed on log scales. The least squares line for y and x1 corresponds to another function

concentration of NO2 particles ≈ g(traffic volume)

where concentration of NO2 particles and traffic volume are on the scale of "counts" (not "log counts"). What is this function? (Give the approximate relationship between traffic count and particle count implied by the SLR analysis.)

ln(concentration) ≈ .6524 + .4495 ln(traffic volume), so

concentration ≈ exp(.6524)·(traffic volume)^.4495

Henceforth, return to the use of the full set of predictors in the modeling of y.

4 pts  g) Do the correlations on page 2 indicate that there is multicollinearity in the data set? Answer "yes" or "no" and explain carefully (say exactly which correlation(s) indicate what).

Yes / No (Circle one only): Yes

explanation: There are non-zero correlations between the predictors.

4 pts  h) In the problem context, why is it not surprising that there is substantial correlation between x1 and x6 (and, for that matter, between x1 and the two variables Vardeman made from x6)?

[Handwritten answer only partly legible. Its gist: traffic volume rises and falls with the hour of the day, and the sine and cosine variables allow for the possibility that volume is "sinusoidal" in hour of the day.]

Below is a JMP report regarding models with the largest R² for a given number of predictors.

[JMP report: All Possible Models, ordered up to the best 1 model for each size, up to 13 terms per model. Columns: Model, Number, RSquare, RMSE, Cp. Entries that can be cross-checked against the rest of the key: the 1-predictor model is x1 (RMSE 0.7615); the 5-predictor model is x1, x3, x4, x7, sinx7 (RSquare 0.6239, RMSE 0.5845); the 13-predictor model has RSquare 0.7198 and RMSE 0.5771. Other entries are not legible in the extracted text.]

4 pts  i) From the information above, which model size looks potentially most effective? (Pick a number of predictors and justify your choice.)

number: 7

[Handwritten justification not legible in the extracted text.]

8 pts  j) Notice that from the earlier JMP reports we know that SSTot = 30.89. Give the value of an F statistic and degrees of freedom for testing whether after accounting for traffic volume (x1) the other (meteorological and time-related) predictor variables together account for a statistically significant part of the variation in y.

SSR(full) = R²(full)·SSTot = (.7198)(30.89) = 22.23

F = [(SSR(full) - SSR(x1))/12] / MSE(full) = [(22.23 - 8.85)/12] / (.5771)²

F = 3.35    d.f. = 12, 26

Below is a JMP report for a k = 5 predictor model. Use it in what follows.

[JMP report: Response y, 5-predictor model. Plots: Actual by Predicted Plot, Residual by Predicted Plot, and a normal plot of studentized residuals. Summary of Fit: RSquare 0.623919; Observations (or Sum Wgts) 40.

Parameter Estimates
Term        Estimate      Std Error     t Ratio   Prob>|t|
Intercept   (illegible)   (illegible)   -0.09     0.9255
x1          0.4542        (illegible)    5.03     <.0001
x3          -0.1665       0.0591        -2.82     0.0080
x4          (illegible)   0.1430         1.62     0.1147
x7          0.00215       0.000616       3.49     0.0014
sinx7       0.4691        0.1953         2.40     0.0219

Other entries are not legible in the extracted text.]

4 pts  k) What is the sample correlation between y and ŷ for the 5-predictor model? (Give a number.)

correlation = sqrt(R²) = sqrt(.6239) = .79

4 pts  l) Do the two plots on the right side of the previous page indicate any serious problems with the 5-predictor MLR model? What should they look like if "all is well"?

[Handwritten answer only partly legible. Its gist: there is some departure involving the largest and smallest studentized residuals, perhaps indicating something about the shape of the error distribution. If all is well, one wants to see "randomness" on the residual plot and a straight line on the normal plot.]

There are histograms for the predictors x1, x3, x4, and x7 on the last page of this exam. They may be helpful for thinking about the predictors.

8 pts  m) According to the fitted 5-predictor model, ŷ depends upon x7 through the function

.00215x7 + .46915 sin(2πx7/365)

Interpret what this says about how pollution is developing over time. (You'll need to say what both terms summed above "do.")

The linear term suggests that log pollution increases with time. The sinusoidal term suggests that it is cyclical across the 365 day year.

8 pts  n) The set of conditions x1 = 7.0, x3 = 2.8, x4 = .27, x7 = 1095 represents more or less "average" traffic volume, wind speed, and temperature difference on a day about 3 years after the beginning of the study (and 1 year beyond the last data point in the data set). For this set of conditions, using the 5-predictor model ŷ ≈ 5.06 and SE_ŷ ≈ .4333. Give 95% prediction limits for an observed y on that day. (Plug in completely, but you need not simplify.)

ŷ ± t·sqrt(s² + SE_ŷ²), that is

5.06 ± 2.032·sqrt((.5845)² + (.4333)²)

(d.f. = n - k - 1 = 40 - 5 - 1 = 34)

8 pts  o) What about the set of conditions in n) potentially calls into question the usefulness of the predictions made there?

This is a clear extrapolation. No observed x7 is anywhere close to x7 = 1095 (more than a year later than the last observation).
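The "plug in completely" answers to parts d), e), j), k), and n) can be checked numerically. The following is a minimal sketch, not part of the original key: it assumes the values as they can be read from the key (s = .7615 for the SLR, s = .5845 for the 5-predictor model, s = .5771 and R² = .7198 for the full model, SSTot = 30.89), hard-codes the chi-square and t quantiles from standard tables rather than computing them, and recovers SSR(x1) as SSTot minus SSE. The function names are illustrative only.

```python
import math

# Values read from the key above (treated as given, not re-estimated).
n = 40            # observations
sst = 30.89       # total sum of squares for y (part j)
s_slr = 0.7615    # root mean square error, SLR on x1
s_5 = 0.5845      # root mean square error, 5-predictor model
s_full = 0.5771   # root mean square error, all 13 predictors


def sigma_limits(s, df, chi2_lower, chi2_upper):
    """Two-sided confidence limits for sigma from s with df error d.f."""
    return (s * math.sqrt(df / chi2_upper), s * math.sqrt(df / chi2_lower))


def prediction_limits(y_hat, s, se_fit, t_quantile):
    """Prediction limits y_hat +/- t * sqrt(s^2 + se_fit^2)."""
    half_width = t_quantile * math.sqrt(s ** 2 + se_fit ** 2)
    return (y_hat - half_width, y_hat + half_width)


# Part d): chi-square quantiles for 38 d.f. are hard-coded table values.
limits_d = sigma_limits(s_slr, n - 2, chi2_lower=22.878, chi2_upper=56.896)

# Part e): 2.024 is the upper 2.5% point of the t distribution, 38 d.f.
limits_e = prediction_limits(0.6524 + 0.4495 * 7.0, s_slr, 0.1205, 2.024)

# Part n): 2.032 is the upper 2.5% point of the t distribution, 34 d.f.
limits_n = prediction_limits(5.06, s_5, 0.4333, 2.032)

# Part j): partial F for the 12 predictors added to x1.
ssr_full = 0.7198 * sst
ssr_x1 = sst - (n - 2) * s_slr ** 2      # SSR = SSTot - SSE for the SLR
f_partial = ((ssr_full - ssr_x1) / 12) / s_full ** 2

# Part k): correlation between y and y-hat is the square root of R^2.
corr_k = math.sqrt(0.6239)

print(limits_d, limits_e, limits_n, round(f_partial, 2), round(corr_k, 2))
```

Keeping the quantiles as explicit arguments mirrors how the exam expects them to be looked up in a table; a statistics library could supply them instead if one is available.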
[Last page: JMP Distributions reports (histogram, Quantiles, and Moments) for x1, x3, x4, and x7, each with N = 40. Legible moments: x3 Mean 2.835; x4 Mean 0.2575. Other entries are not legible in the extracted text.] ...
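The interpretations asked for in parts f) and m) can also be explored numerically. This is a rough sketch under stated assumptions, not part of the original key: it takes the SLR coefficients .6524 and .4495 and the x7 terms .00215 and .46915 as read from the key, and it assumes the logarithms are natural logs. The function names are hypothetical.

```python
import math

B0, B1 = 0.6524, 0.4495   # SLR intercept and slope on the log scale (part f)


def concentration(traffic_count):
    """Implied NO2 concentration on the raw scale (natural logs assumed)."""
    return math.exp(B0) * traffic_count ** B1


def time_effect(day):
    """Contribution of x7 to y-hat in the 5-predictor model (part m)."""
    return 0.00215 * day + 0.46915 * math.sin(2 * math.pi * day / 365)


# Doubling traffic multiplies the implied concentration by 2 ** B1.
doubling_factor = concentration(2000.0) / concentration(1000.0)

# One full year later the sine term repeats, so only the trend changes.
yearly_change = time_effect(365 + 100) - time_effect(100)

print(round(doubling_factor, 3), round(yearly_change, 3))
```

Under these assumptions a doubling of traffic raises the implied concentration by a factor of roughly 1.37, and the linear term adds about .78 to the predicted log concentration per year, with the sine term cycling once per 365 days on top of that trend.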

This note was uploaded on 02/11/2012 for the course STAT 231 taught by Professor Staff during the Fall '08 term at Iowa State.
