Chapter05_residual analysis


Chapter 5 Residual Analysis

Summary

1. Various formulas for standard deviations and confidence/prediction intervals:

Population model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_K X_K + \varepsilon$

Prediction equation (least squares equation): $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_K X_K$

Residuals: $e = Y - \hat{Y}$

(1) Standard deviation of Y (no regression model, that is, a model with only an intercept): $s = \sqrt{\dfrac{SYY}{n-1}}$. About 95% of the Y values lie within the range $(\bar{Y} - 2s,\ \bar{Y} + 2s)$. The range is more accurate if Y is normally distributed and n is large. The more exact multiplier is 1.96 instead of 2.

(2) Let $MSE = \dfrac{SSE}{n-(K+1)}$, where K = number of independent variables and K+1 = number of parameters including the intercept.

(3) With a model, the estimated standard deviation (sd) of the error is $s_e = \sqrt{MSE}$. About 95% of the residuals lie within the range $(-2s_e,\ 2s_e)$ around 0. About 95% of the Y values lie within the range $(\hat{Y} - 2s_e,\ \hat{Y} + 2s_e)$, that is, within $(-2s_e,\ 2s_e)$ of the line $\hat{Y}$.

The following summaries (4) to (6) apply only to simple regression.

(4) Estimated standard error (se) of the slope: $s_b = \dfrac{s_e}{\sqrt{SXX}}$. The $100(1-\alpha)\%$ confidence interval for the population slope $\beta$ is $(b - t_0 s_b,\ b + t_0 s_b)$
where the critical value $t_0$ is found so that the combined area under a t-distribution with $n-2$ degrees of freedom above $t_0$ and below $-t_0$ is $\alpha$. That is, (area above $t_0$) + (area below $-t_0$) = $\alpha$. The Excel function is t0 = T.INV.2T(alpha, df). Here, the "2" indicates a two-sided area and df = n-2.

(5) Let $s_m = s_e \sqrt{\dfrac{1}{n} + \dfrac{(X_p - \bar{X})^2}{SXX}}$. The $100(1-\alpha)\%$ confidence interval for the mean value of Y at $X = X_p$, where $X_p$ is a new value of X not already in the dataset, is $(ML,\ MU) = (\hat{Y} - t_0 s_m,\ \hat{Y} + t_0 s_m)$.

(6) Let $s_p = s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(X_p - \bar{X})^2}{SXX}}$. The $100(1-\alpha)\%$ prediction interval for an individual value of Y at $X = X_p$ is $(PL,\ PU) = (\hat{Y} - t_0 s_p,\ \hat{Y} + t_0 s_p)$.

2. Theory behind the regression analysis

Y is normally distributed with mean $\mu_y$ and variance $\sigma_y^2$, where $\mu_y = E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_K X_K$ is the mean line as a function of the Xs. Also, $Y = E(Y) + \varepsilon$.
$E(Y)$ is considered a fixed but unknown value, so $\sigma_y^2 = \sigma_\varepsilon^2$. We want to estimate the betas and then estimate $E(Y)$.

Assumptions about Y:

• Normality
• Homogeneity of variance
• Independence
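The simple-regression formulas (1) to (6) in the summary can be checked numerically. The following is a minimal sketch, not the course's own code: the function name `simple_regression_summary`, the toy data, and the use of the standard least-squares estimates $b_1 = SXY/SXX$ and $b_0 = \bar{Y} - b_1 \bar{X}$ are assumptions added for illustration. The critical value `t0` is passed in by the caller, for example as read from T.INV.2T(alpha, n-2) in Excel.

```python
import math


def simple_regression_summary(x, y, x_p, t0):
    """Evaluate summary formulas (1)-(6) for a simple regression (K = 1).

    t0 is the two-sided t critical value with n - 2 degrees of freedom,
    supplied by the caller (e.g. from Excel's T.INV.2T(alpha, n - 2)).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Standard least-squares estimates (assumed; not spelled out in the notes).
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    s = math.sqrt(syy / (n - 1))                    # (1) sd of Y, no model
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - (1 + 1))                       # (2) with K = 1
    s_e = math.sqrt(mse)                            # (3) sd of the error
    s_b = s_e / math.sqrt(sxx)                      # (4) se of the slope

    y_hat_p = b0 + b1 * x_p                         # fitted value at X = x_p
    spread = (x_p - x_bar) ** 2 / sxx
    s_m = s_e * math.sqrt(1 / n + spread)           # (5) se of the mean of Y
    s_p = s_e * math.sqrt(1 + 1 / n + spread)       # (6) se of an individual Y

    return {
        "b0": b0, "b1": b1, "s": s, "mse": mse,
        "s_e": s_e, "s_b": s_b, "s_m": s_m, "s_p": s_p,
        "slope_ci": (b1 - t0 * s_b, b1 + t0 * s_b),
        "mean_ci": (y_hat_p - t0 * s_m, y_hat_p + t0 * s_m),
        "pred_int": (y_hat_p - t0 * s_p, y_hat_p + t0 * s_p),
    }


# Made-up toy data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
t0 = 3.182  # two-sided 5% critical value for df = 3, e.g. T.INV.2T(0.05, 3)
result = simple_regression_summary(x, y, x_p=6, t0=t0)
for name, value in result.items():
    print(name, value)
```

Because $s_p$ contains the extra "1 +" term under the square root, the prediction interval is always wider than the confidence interval for the mean at the same $X_p$, and both widen as $X_p$ moves away from $\bar{X}$.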
Residual Analysis

Readings: M&S Chapter 8, 8.1 - 8.7, a very good chapter

Key words:

• Residuals ($e$)
• Standardized residuals ($z_e$)
• Leverage ($h$)
• Studentized residuals ($t_e$)
• Studentized residuals after removing the current observation ($t_e^{*}$)
• Cook's Distance ($D$)
• PRESS
• DFFITS (difference in fits)
• DFBETAS (difference in betas)
• Tolerance and VIF (variance inflation factor)

Details

Residual: $e_i = Y_i - \hat{Y}_i$. The keyword in the OUTPUT statement of PROC REG is "RESIDUAL". Use it to look for residual patterns to improve the regression model.

Standardized residual: $z_i = \dfrac{Y_i - \hat{Y}_i}{s_e}$, where $s_e = \sqrt{MSE}$.
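As a rough illustration of the two definitions above, the sketch below computes residuals and standardized residuals from observed and fitted values. The helper name `standardized_residuals`, the toy data, and the fitted values (taken from an assumed least-squares line $\hat{Y} = 2.2 + 0.6X$ at X = 1, ..., 5) are hypothetical and not from the notes. The cutoff of 2 is borrowed from the "about 95% within $\pm 2 s_e$" rule in the summary.

```python
import math


def standardized_residuals(y, y_hat, k):
    """Return (residuals, standardized residuals) for a model with k predictors."""
    n = len(y)
    residuals = [yi - fi for yi, fi in zip(y, y_hat)]   # e_i = Y_i - Y-hat_i
    sse = sum(e ** 2 for e in residuals)
    mse = sse / (n - (k + 1))                           # MSE = SSE / (n - (K + 1))
    s_e = math.sqrt(mse)                                # s_e = sqrt(MSE)
    z = [e / s_e for e in residuals]                    # z_i = e_i / s_e
    return residuals, z


# Made-up observed values and the corresponding fitted values (assumed fit).
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

residuals, z = standardized_residuals(y, y_hat, k=1)

# Indices of observations whose standardized residual exceeds 2 in absolute value.
flagged = [i for i, zi in enumerate(z) if abs(zi) > 2]

for i, (e, zi) in enumerate(zip(residuals, z)):
    print(i, round(e, 3), round(zi, 3))
print("flagged:", flagged)
```

In SAS the same residuals would come from the RESIDUAL keyword in the OUTPUT statement of PROC REG, as noted above. The Python version is only meant to make the arithmetic explicit.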
