# hw3-soln - STOR 664 Homework 3 Solution

## Ch.4 Ex.2

The model is $Y = X\beta + \epsilon$, where

$$
Y = (y_1, y_2, \ldots, y_n)', \quad
X = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ \vdots & \vdots \\ x_{n1} & x_{n2} \end{pmatrix}, \quad
\beta = (\beta_1, \beta_2)', \quad
\epsilon = (\epsilon_1, \epsilon_2, \ldots, \epsilon_n)'.
$$

Write $A = \sum_j x_{j2}^2$, $B = \sum_j x_{j1} x_{j2}$, $C = \sum_j x_{j1}^2$, and $\Delta = AC - B^2$. Then

$$
X'X = \begin{pmatrix} C & B \\ B & A \end{pmatrix}, \qquad
(X'X)^{-1} = \frac{1}{\Delta} \begin{pmatrix} A & -B \\ -B & C \end{pmatrix},
$$

so the $i$-th diagonal element of the hat matrix is

$$
h_{ii} = \left( X (X'X)^{-1} X' \right)_{ii} = \frac{1}{\Delta} \left( A x_{i1}^2 - 2 B x_{i1} x_{i2} + C x_{i2}^2 \right).
$$

Since $\Delta$ does not depend on $i$ and is positive whenever the columns of $X$ are linearly independent, the point of highest leverage is

$$
\arg\max_i h_{ii} = \arg\max_i \frac{1}{\Delta} \left( A x_{i1}^2 - 2 B x_{i1} x_{i2} + C x_{i2}^2 \right) = \arg\max_i \left( A x_{i1}^2 - 2 B x_{i1} x_{i2} + C x_{i2}^2 \right).
$$

## Ch.4 Ex.6

First, fit a simple linear model with the average temperature of Charleston (ch) as the response variable $y$ and the average temperature of Mt. Airy (ma) as the explanatory variable $x$. (At this step the number of observations is 49.)

Coefficients:

| Term | Estimate | Std. Error | t value | Pr(>\|t\|) | Signif. |
|---|---|---|---|---|---|
| (Intercept) | 13.63641 | 2.25682 | 6.042 | 2.32e-07 | `***` |
| ma | 0.56790 | 0.09595 | 5.918 | 3.58e-07 | `***` |

Residual standard error: 0.447 on 47 degrees of freedom
Multiple R-squared: 0.427, Adjusted R-squared: 0.4148
F-statistic: 35.03 on 1 and 47 DF, p-value: 3.577e-07

The scatter plot of ch vs. ma suggests that there may be some outliers. The plot of residuals vs. fitted values also points to some candidate outliers, but shows no systematic pattern.

To check whether the residuals are consistent with the Normal assumption and to identify candidate outliers, I plotted the internally standardized residuals, the externally standardized residuals, and a Q-Q plot. From these plots, observations 30, 38, 40 have residuals outside the range $[-2, 2]$. The Q-Q plot also departs from the reference line, and these three points are the suspected cause of the non-Normality.

To look for influential observations, I also plotted the diagonal elements of the hat matrix, DFFITS, DFBETAS, Cook's D statistics, and COVRATIO. The dotted lines in each plot mark the critical values used to detect influential observations.
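The closed-form leverage from Ex.2 can be checked numerically. The following is a minimal sketch in Python (the original analysis appears to have used R); the function name `leverages` and the toy design `X_toy` are illustrative assumptions, not data from the homework. It evaluates $h_{ii}$ from $A$, $B$, $C$, and $\Delta$, and uses the fact that the hat-matrix diagonals sum to the number of columns of $X$ (here 2) as a sanity check.

```python
def leverages(rows):
    """Hat-matrix diagonals h_ii for a two-column design X (no separate intercept).

    rows: list of (x_i1, x_i2) pairs, one per observation.
    Uses the closed form h_ii = (A*x_i1^2 - 2*B*x_i1*x_i2 + C*x_i2^2) / Delta.
    """
    A = sum(x2 * x2 for _, x2 in rows)   # sum of x_j2^2
    B = sum(x1 * x2 for x1, x2 in rows)  # sum of x_j1 * x_j2
    C = sum(x1 * x1 for x1, _ in rows)   # sum of x_j1^2
    delta = A * C - B * B                # determinant of X'X
    if delta <= 0:
        raise ValueError("columns of X are linearly dependent")
    return [(A * x1 * x1 - 2 * B * x1 * x2 + C * x2 * x2) / delta
            for x1, x2 in rows]


# Hypothetical toy design: a column of ones plus one covariate.
X_toy = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 5.0)]
h = leverages(X_toy)

# Index of the highest-leverage point (argmax of h_ii).
top = max(range(len(h)), key=h.__getitem__)

print(h)
print(sum(h))  # trace of the hat matrix; should equal 2 up to rounding
print(top)
```

In this toy design the last observation sits far from the others in the second column, so it is the one picked out by the argmax.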
As an example of these critical values, the dotted lines in the DFFITS plot are horizontal lines at $+2\sqrt{p/n}$ and $-2\sqrt{p/n}$, which are 0.404061 and -0.404061 for $p = 2$ and $n = 49$.

The influential observations detected by each plot are as follows.

| Diagnostic | Influential observations |
|---|---|
| Leverage | 5, 20, 25, 40 |
| DFFITS | 30, 38, 40, 41 |
| DFBETAS | 30, 40, 41 |
| Cook's D | 30, 38, 40, 41 |
| COVRATIO | 5, 25, 38, 41 |

Simulation envelope plots of the externally Studentized residuals and of DFFITS also show that observations 38 and 41 are influential and appear aberrant relative to what would be expected if the residuals followed a Normal distribution.

Figure panels (images not available in this text):

- Charleston vs Mt. Airy with regression line (x: ma, y: ch)
- Standardized residual plot (x: `lm$fitted.value`, y: std; labeled points 30, 38, 41)
- Internally Standardized Resid (x: Index, y: Internally.Studentized.Residual; labeled points 30, 38, 41)
- Externally Studentized Resid (x: Index, y: Externally.Studentized.Residual; labeled points 30, 38, 41)
- Normal Q-Q Plot (x: Theoretical Quantiles, y: Sample Quantiles)
- leverage plot (x: Index, y: `inf$hat`; labeled points 5, 20, 25, 40)

...
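The DFFITS critical value and the overlap between diagnostics can be reproduced with a few lines of arithmetic. This is a hedged sketch in Python rather than the R used for the original plots; the helper name `dffits_cutoff` and the dictionary `flagged` are illustrative, and the observation lists are copied from the results reported above. Only the $2\sqrt{p/n}$ rule stated in the text is used; the cutoffs for the other diagnostics are not given in the source and are not assumed here.

```python
import math
from collections import Counter


def dffits_cutoff(p, n):
    """Critical value 2*sqrt(p/n) for |DFFITS| with p coefficients and n observations."""
    return 2.0 * math.sqrt(p / n)


# Simple linear regression on 49 observations: intercept + slope, so p = 2.
cutoff = dffits_cutoff(p=2, n=49)
print(round(cutoff, 6))  # 0.404061

# Observations flagged by each diagnostic, as reported in the solution.
flagged = {
    "Leverage": [5, 20, 25, 40],
    "DFFITS": [30, 38, 40, 41],
    "DFBETAS": [30, 40, 41],
    "Cook's D": [30, 38, 40, 41],
    "COVRATIO": [5, 25, 38, 41],
}

# Count how many of the five diagnostics flag each observation.
counts = Counter(obs for obs_list in flagged.values() for obs in obs_list)
for obs, k in sorted(counts.items()):
    print(obs, k)
```

Tallying this way makes it easy to see which observations are flagged repeatedly: 40 and 41 each appear under four of the five diagnostics, while 20 appears only under leverage.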

## This note was uploaded on 11/17/2011 for the course STOR 664 taught by Professor Staff during the Fall '11 term at UNC.
