# Introduction to Econometrics, Lecture 6: Multiple Regression - Goodness of Fit


## Measuring the Goodness of Fit: R²

For a multiple regression, measures of goodness of fit based on the correlation coefficient between any individual regressor Z and Y,

Corr(Z,Y) = Cov(Z,Y)/√(Var(Z)Var(Y)),

are not very useful. However, we can still decompose Var(Y) into explained and unexplained components [1]. In the two-variable case

Y = αᵉ + βᵉX + θᵉZ + Uᵉ

Var(Y) = Var(αᵉ + βᵉX + θᵉZ + Uᵉ)
       = (βᵉ)²Var(X) + 2βᵉθᵉCov(X,Z) + 2βᵉCov(X,Uᵉ) + (θᵉ)²Var(Z) + 2θᵉCov(Z,Uᵉ) + Var(Uᵉ)

since αᵉ is a constant. In addition, Cov(X,Uᵉ) and Cov(Z,Uᵉ) are both zero (this was shown in Lecture 5), so the expression reduces to

Var(Y) = {(βᵉ)²Var(X) + 2βᵉθᵉCov(X,Z) + (θᵉ)²Var(Z)} + Var(Uᵉ)

The first three terms (in {}) make up the Explained Sum of Squares SSE, and the last the Residual Sum of Squares SSR (each divided by N-1) [2]. So

R² = (Var(Y) - Var(Uᵉ))/Var(Y) = (SST - SSR)/SST = 1 - SSR/SST

This decomposition is exactly the same as in the case of simple regression; similarly, the ANalysis Of VAriance table (set out in Lecture 4, repeated below) is equally useful for a multivariate regression.

The value of R² depends on the precise form of Y (since Var(Y) is the variance around mean(Y)), so there is a vital rule to remember:

Never use R² to compare the fit of regressions with different dependent variables.

In addition, the statistical significance associated with a given value of R² is a function of the number of observations and the number of explanatory variables, so a second rule is:

Do not use R² (even informally) to compare regressions with a different number of observations.

R² always increases if more regressors are added, so if you want to compare two regressions with the same dependent variable you should use adjusted R² (R-bar squared) instead.
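The variance decomposition can be checked numerically. The sketch below (a hypothetical setup; the simulated data, variable names, and use of NumPy are my own, not from the notes) fits a two-regressor OLS and confirms that the {(βᵉ)²Var(X) + 2βᵉθᵉCov(X,Z) + (θᵉ)²Var(Z)} terms, divided by Var(Y), reproduce R² = 1 - SSR/SST:

```python
import numpy as np

# Simulated data with correlated regressors (illustrative values only)
rng = np.random.default_rng(0)
N = 500
X = rng.normal(size=N)
Z = 0.5 * X + rng.normal(size=N)
Y = 1.0 + 2.0 * X - 1.5 * Z + rng.normal(size=N)

# OLS of Y on a constant, X, and Z
W = np.column_stack([np.ones(N), X, Z])
coef, *_ = np.linalg.lstsq(W, Y, rcond=None)
a, b, c = coef                       # estimates of alpha, beta, theta
U = Y - W @ coef                     # residuals U^e

SST = np.sum((Y - Y.mean()) ** 2)
SSR = np.sum(U ** 2)                 # residual sum of squares
R2 = 1 - SSR / SST

# Explained part of Var(Y), term by term as in the decomposition;
# Cov(X,U) and Cov(Z,U) vanish for OLS residuals, so nothing is lost.
explained = (b**2 * X.var(ddof=1)
             + 2 * b * c * np.cov(X, Z, ddof=1)[0, 1]
             + c**2 * Z.var(ddof=1))

# explained / Var(Y) matches R2 up to floating-point error
```

The sample covariances Cov(X,Uᵉ) and Cov(Z,Uᵉ) are exactly zero for OLS residuals, which is why the two ways of computing R² agree to machine precision.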
Adjusted R² is defined as

adjusted R² = R² - {(K-1)/(N-K)}(1 - R²)

so it can either increase or decrease if more regressors are added. It is often better to compare two regressions using the square root of the estimated residual variance,

s = √(Σ(uᵉᵢ)²/(N-K))

Notes:
[1] The logic behind this decomposition for simple regression was discussed in Lecture 4.
[2] Remember the warning (in Lecture 4) about the possible alternative meanings for the abbreviations SSE and SSR.
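As a small sketch of these two formulas (the function names are my own, not from the notes), with N observations and K estimated coefficients:

```python
def adjusted_r2(r2, n, k):
    """R-bar squared: R^2 minus the penalty {(K-1)/(N-K)}(1 - R^2)."""
    return r2 - (k - 1) / (n - k) * (1 - r2)

def residual_std_error(residuals, k):
    """s = sqrt(sum of squared residuals / (N - K))."""
    n = len(residuals)
    return (sum(u**2 for u in residuals) / (n - k)) ** 0.5
```

Because the penalty term grows with K, adding a regressor raises adjusted R² only when the gain in R² outweighs the loss of a degree of freedom; with a useless regressor, adjusted R² falls even though R² rises.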
