STOR 664 Homework 3 Solution

Ch.4 Ex.2

The model is $Y = X\beta + \epsilon$, where

$$Y = (y_1, y_2, \ldots, y_n)', \quad X = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ \vdots & \vdots \\ x_{n1} & x_{n2} \end{pmatrix}, \quad \beta = (\beta_1, \beta_2)', \quad \epsilon = (\epsilon_1, \epsilon_2, \ldots, \epsilon_n)'.$$

Write $A = \sum_j x_{j2}^2$, $B = \sum_j x_{j1} x_{j2}$, $C = \sum_j x_{j1}^2$, and $\Delta = AC - B^2$. Then

$$X'X = \begin{pmatrix} C & B \\ B & A \end{pmatrix}, \qquad (X'X)^{-1} = \frac{1}{\Delta} \begin{pmatrix} A & -B \\ -B & C \end{pmatrix},$$

$$h_{ii} = \left( X (X'X)^{-1} X' \right)_{ii} = \frac{1}{\Delta} \left( A x_{i1}^2 - 2 B x_{i1} x_{i2} + C x_{i2}^2 \right).$$

Since $\Delta > 0$ does not depend on $i$, the point of highest leverage is

$$\arg\max_i h_{ii} = \arg\max_i \frac{1}{\Delta} \left( A x_{i1}^2 - 2 B x_{i1} x_{i2} + C x_{i2}^2 \right) = \arg\max_i \left( A x_{i1}^2 - 2 B x_{i1} x_{i2} + C x_{i2}^2 \right).$$

Ch.4 Ex.6

First, fit a simple linear model with the average temperature of Charleston (ch) as the response variable y and the average temperature of Mt. Airy (ma) as the explanatory variable x. (At this step, the number of observations is 49.)

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 13.63641    2.25682   6.042 2.32e-07 ***
    ma           0.56790    0.09595   5.918 3.58e-07 ***

    Residual standard error: 0.447 on 47 degrees of freedom
    Multiple R-squared: 0.427, Adjusted R-squared: 0.4148
    F-statistic: 35.03 on 1 and 47 DF, p-value: 3.577e-07

The scatter plot of ch vs. ma suggests that there may be some outliers. The plot of residuals vs. fitted y values also gives some candidates for outliers. No pattern is found here.

To check whether the residuals satisfy the Normal assumption and whether there are candidate outliers, I plotted the internally standardized residuals, the externally standardized residuals, and a QQ plot. From the plots, observations 30, 38, 40 have residuals outside the range $[-2, 2]$. The QQ plot also does not follow the line, and these 3 points are the suspected cause of the non-Normality.

To look for influential observations, I also plotted the diagonal elements of the hat matrix, DFFITS, DFBETAS, Cook's D statistics, and COVRATIO. The dotted lines in each plot mark the critical values for detecting influential observations.
For example, the lines in the DFFITS plot are horizontal lines at $+2\sqrt{p/n}$ and $-2\sqrt{p/n}$ (which are 0.404061 and $-0.404061$). The influential observations detected by each plot are as follows.

    Leverage: 5, 20, 25, 40
    DFFITS: 30, 38, 40, 41
    DFBETAS: 30, 40, 41
    Cook's D: 30, 38, 40, 41
    COVRATIO: 5, 25, 38, 41

Simulation envelope plots of the externally Studentized residuals and DFFITS also show that observations 38 and 41 are influential and would appear aberrant if the residuals followed a Normal distribution.

[Figures: "Charleston vs Mt. Airy with regression line" (ch vs. ma); "Standardized residual plot" (std vs. lm$fitted.value; points 30, 38, 41 labeled); "Internally Standardized Resid" (Internally.Studentized.Residual vs. Index; points 30, 38, 41 labeled); "Externally Studentized Resid" (Externally.Studentized.Residual vs. Index; points 30, 38, 41 labeled); "Normal QQ Plot" (Sample Quantiles vs. Theoretical Quantiles); "leverage plot" (inf$hat vs. Index; points 5, 20, 25, 40 labeled) ...]
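As a rough numerical check of two formulas used above, the following Python sketch evaluates the closed-form leverage from Ch.4 Ex.2 and the DFFITS cutoff $2\sqrt{p/n}$ from Ch.4 Ex.6. The function names `leverages` and `dffits_cutoff` and the four-point data set are made up for illustration; they are not taken from the homework or its temperature data, and the original analysis appears to have been done in R rather than Python.

```python
import math


def leverages(x1, x2):
    """Hat-matrix diagonal h_ii for a two-column design matrix X = [x1, x2].

    Uses the closed form h_ii = (A*x_i1^2 - 2*B*x_i1*x_i2 + C*x_i2^2) / Delta.
    """
    A = sum(v * v for v in x2)              # A = sum_j x_j2^2
    B = sum(u * v for u, v in zip(x1, x2))  # B = sum_j x_j1 * x_j2
    C = sum(u * u for u in x1)              # C = sum_j x_j1^2
    delta = A * C - B * B                   # Delta = AC - B^2
    if delta == 0:
        raise ValueError("columns of X are collinear, so X'X is singular")
    return [(A * u * u - 2 * B * u * v + C * v * v) / delta
            for u, v in zip(x1, x2)]


def dffits_cutoff(p, n):
    """Conventional DFFITS critical value 2 * sqrt(p / n)."""
    return 2 * math.sqrt(p / n)


# Hypothetical data: a column of ones and one covariate with a remote point.
x1 = [1.0, 1.0, 1.0, 1.0]
x2 = [0.0, 1.0, 2.0, 10.0]

h = leverages(x1, x2)
most_leveraged = max(range(len(h)), key=h.__getitem__)  # 0-based index

# p = 2 coefficients and n = 49 observations, as in Ch.4 Ex.6.
cutoff = dffits_cutoff(2, 49)

print("leverages:", [round(v, 4) for v in h])
print("highest-leverage index (0-based):", most_leveraged)
print("DFFITS cutoff:", round(cutoff, 6))
```

In this toy data the remote point (the last one) has the largest leverage, and the leverages sum to 2, the number of columns of X, as the trace of the hat matrix should. With p = 2 and n = 49 the cutoff agrees with the value 0.404061 quoted above.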
 Fall '11
 Staff
