lab9f09_ans

Stat401 C Assignment #9 Fall 2009

1. (a) Correlations

          y        x1        x2        x3        x4        x5        x6
y      1.0000   -0.4336    0.6448    0.4938    0.0947    0.0543    0.3696
x1    -0.4336    1.0000   -0.1900   -0.0627   -0.3497    0.3863   -0.4302
x2     0.6448   -0.1900    1.0000    0.9553    0.2379   -0.0324    0.1318
x3     0.4938   -0.0627    0.9553    1.0000    0.2126   -0.0261    0.0421
x4     0.0947   -0.3497    0.2379    0.2126    1.0000   -0.0130    0.1641
x5     0.0543    0.3863   -0.0324   -0.0261   -0.0130    1.0000    0.4961
x6     0.3696   -0.4302    0.1318    0.0421    0.1641    0.4961    1.0000

[Figure: Scatterplot Matrix]

Both the sample correlations and the scatterplot matrix show that:

- x2 is highly correlated with x3
- x1 is mildly correlated with x4, x5, and x6
- x5 is mildly correlated with x6
- y is somewhat mildly correlated with x1, x2, x3, and x6

Since several significant correlations exist among pairs of variables, they could lead to large multiple correlations among the predictors, of the type $R^2_{x_1 \cdot x_2 x_3 x_4}$ (the squared multiple correlation of one predictor with the others). This will inflate the standard errors of the estimates of the corresponding coefficients ($\beta$'s) in the full model, an effect of multicollinearity.
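The link between predictor correlations and inflated standard errors can be made concrete with variance inflation factors. A standard identity is $VIF_j = 1/(1 - R_j^2)$, where $R_j^2$ is the squared multiple correlation of predictor $x_j$ with the remaining predictors, and $VIF_j$ equals the $j$-th diagonal element of the inverse of the predictor correlation matrix. The sketch below applies that identity to the correlations tabulated above. The helper names `invert` and `vifs` are illustrative, not part of the assignment, and because the inputs are rounded to four decimals the results are only approximate.

```python
# Predictor correlation matrix (x1..x6), copied from the table in part (a).
NAMES = ["x1", "x2", "x3", "x4", "x5", "x6"]
R = [
    [ 1.0000, -0.1900, -0.0627, -0.3497,  0.3863, -0.4302],
    [-0.1900,  1.0000,  0.9553,  0.2379, -0.0324,  0.1318],
    [-0.0627,  0.9553,  1.0000,  0.2126, -0.0261,  0.0421],
    [-0.3497,  0.2379,  0.2126,  1.0000, -0.0130,  0.1641],
    [ 0.3863, -0.0324, -0.0261, -0.0130,  1.0000,  0.4961],
    [-0.4302,  0.1318,  0.0421,  0.1641,  0.4961,  1.0000],
]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right-hand half now holds the inverse.
    return [row[n:] for row in aug]


def vifs(corr):
    """Return the VIFs: the diagonal of the inverse correlation matrix."""
    inv = invert(corr)
    return [inv[j][j] for j in range(len(corr))]


vif = dict(zip(NAMES, vifs(R)))
for name, value in vif.items():
    r2_on_others = 1.0 - 1.0 / value  # invert VIF = 1 / (1 - R^2)
    print(f"{name}: VIF = {value:6.2f}, R^2 on other predictors = {r2_on_others:.3f}")
```

The printed values can be compared with the VIF column in the regression output of part (b). Small differences are expected from the rounding of the correlations.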

1. (b) Multiple regression of y on x1 to x6

Summary of Fit

RSquare                       0.669512
RSquare Adj                   0.61119
Root Mean Square Error        14.63604
Mean of Response              30.04878
Observations (or Sum Wgts)    41

Analysis of Variance

Source      DF   Sum of Squares   Mean Square   F Ratio
Model        6        14754.636       2459.11   11.4797
Error       34         7283.266        214.21   Prob > F
C. Total    40        22037.902                 <.0001*

Parameter Estimates

Term         Estimate    Std Error   t Ratio   Prob>|t|   Lower 95%    Upper 95%    VIF
Intercept    111.72848   47.3181       2.36    0.0241*    15.56653     207.89043    .
x1           -1.267941   0.62118      -2.04    0.0491*    -2.53033     -0.005552    3.7639957
x2           0.0649182   0.015748      4.12    0.0002*    0.0329139    0.0969225    14.703652
x3           -0.039277   0.015133     -2.60    0.0138*    -0.07003     -0.008523    14.340833
x4           -3.181366   1.815019     -1.75    0.0887     -6.869928    0.5071968    1.255519
x5           0.512359    0.362755      1.41    0.1669     -0.224848    1.249566     3.4049206
x6           -0.05205    0.162014     -0.32    0.7500     -0.381302    0.2772016    3.4436511

[Figure: Residual by Predicted Plot]

See the attached page (next page) for additional graphics, and the page following that for the residual and diagnostic statistics used in the discussion below.

The following conclusions may be drawn from the above analysis:

1. Reject the hypothesis $H_0: \beta_1 = \cdots = \beta_6 = 0$ vs. $H_a$: at least one $\beta_j$ is not zero, at the .05 significance level, using the F-statistic from the ANOVA table above, since the p-value < .05.
2. The squared multiple correlation coefficient is $R^2 = .67$, so 67% of the variation in y is explained by the regression using all 6 variables.
3. The p-values associated with the t-tests of $H_0: \beta_j = 0$, $j = 1, \ldots, 6$, as well as the 95% CIs for the coefficients, indicate that only the coefficients for x1, x2, and x3 are significantly different from zero at the .05 level when tested one at a time.
4. There is an indication of multicollinearity associated with the variables x2 and x3, as evidenced by their somewhat large VIF values. These were the two variables most highly correlated with each other. Thus a viable model may be found by eliminating some subset of variables from the full model.
5. The homogeneity of variance assumption is suspect, as each of the plots shows a distinct pattern indicating dependence on the x variables. On the other hand, the presence of a single outlier is also indicated in all plots.
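The summary statistics quoted in conclusions 1 to 3 follow from the sums of squares, degrees of freedom, estimates, and standard errors in the output by simple arithmetic. The sketch below recomputes them from the reported values. The critical value `T_CRIT` (the 0.975 quantile of the t distribution with 34 degrees of freedom, about 2.0322) is an assumed constant supplied by hand, not something taken from the output, and the variable names are illustrative.

```python
import math

# Values copied from the Analysis of Variance table.
ss_model, df_model = 14754.636, 6
ss_error, df_error = 7283.266, 34
ss_total = ss_model + ss_error      # corrected total sum of squares
df_total = df_model + df_error      # n - 1 = 40

ms_model = ss_model / df_model      # Model mean square
ms_error = ss_error / df_error      # Error mean square
f_ratio = ms_model / ms_error       # overall F statistic for H0: all slopes zero
r2 = ss_model / ss_total            # RSquare
r2_adj = 1.0 - ms_error / (ss_total / df_total)  # RSquare Adj
rmse = math.sqrt(ms_error)          # Root Mean Square Error

print(f"F = {f_ratio:.4f}, R^2 = {r2:.6f}, adj R^2 = {r2_adj:.5f}, RMSE = {rmse:.5f}")

# t ratio and 95% confidence interval for the x1 coefficient.
T_CRIT = 2.0322                     # assumed: t quantile 0.975 with 34 df
estimate, std_error = -1.267941, 0.62118
t_ratio = estimate / std_error
ci_lower = estimate - T_CRIT * std_error
ci_upper = estimate + T_CRIT * std_error

print(f"x1: t = {t_ratio:.2f}, 95% CI = ({ci_lower:.5f}, {ci_upper:.6f})")
# An interval that excludes zero agrees with rejecting H0: beta_1 = 0 at the .05 level.
print("x1 significant at .05:", not (ci_lower <= 0.0 <= ci_upper))
```

Running the same two lines of interval arithmetic with the x4, x5, or x6 rows gives intervals that contain zero, which matches the non-significant p-values reported for those terms.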

This note was uploaded on 01/23/2010 for the course STAT 213 taught by Professor Hao during the Spring '10 term at Internet2.
