LinearRegression3


Adjusted R-Squared

$$\bar{R}^2 = 1 - \frac{SSR / (n - k - 1)}{SST / (n - 1)}$$

- It is an alternative measure of goodness-of-fit.
- The adjusted $R^2$ takes into account the number of variables in a model. It may therefore decrease as additional variables are added.

Remarks:
- It is easy to show that the adjusted $R^2$ is equal to $\bar{R}^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1)$.
- We can compare the fit of two models (with the same $y$) by comparing their adjusted $R^2$ (see the first code sketch at the end of these notes).
- We cannot use the adjusted $R^2$ to compare models with different $y$'s (e.g., $y$ vs. $\log(y)$).

What Happens When We Have Too Many or Too Few Variables?

This is an important question to consider when doing practical and empirical research.

Main results:
- Too many variables: What happens if we include variables in our specification that don't belong?
  o The OLS estimators remain unbiased.
  o The variances of the estimators increase.
- Too few variables: What happens if we exclude a variable from our specification that does belong?
  o OLS will usually be biased.
  o This is called the omitted variable bias.

The Omitted Variable Bias Examined

Consider the two models

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u \quad \text{(true model)}$$
$$y = \beta_0 + \beta_1 x_1 + u \quad \text{(misspecified model)}$$

with the fitted regressions

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$$
$$\tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1$$

Question: What is the relationship between the estimates $\tilde{\beta}_1$ and $\hat{\beta}_1$?

Answer: It can be shown that

$$\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1$$

where $\tilde{\delta}_1$ is the slope of a regression of $x_2$ on $x_1$ (i.e., $x_2 = \delta_0 + \delta_1 x_1 + e$). Therefore, the omitted variable bias in the two-variable linear regression is $\tilde{\beta}_1 - \hat{\beta}_1 = \hat{\beta}_2 \tilde{\delta}_1$.

In general, $\tilde{\beta}_1 \neq \hat{\beta}_1$ unless:
1. $\hat{\beta}_2 = 0$ (i.e., no partial effect of $x_2$), or
2. $x_1$ and $x_2$ are uncorrelated in the sample ($\tilde{\delta}_1 = 0$).

Proof:

We have that

$$\tilde{\beta}_1 = \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1) y_i}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}$$

for the model that omits $x_2$. Using the true model $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i$, we can determine the numerator of $\tilde{\beta}_1$ as follows:

$$\sum_{i=1}^{n} (x_{i1} - \bar{x}_1) y_i = \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i)$$
$$= \beta_1 \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2 + \beta_2 \sum_{i=1}^{n} (x_{i1} - \bar{x}_1) x_{i2} + \sum_{i=1}^{n} (x_{i1} - \bar{x}_1) u_i$$

(the $\beta_0$ term drops out because $\sum_{i=1}^{n} (x_{i1} - \bar{x}_1) = 0$). Therefore,

$$\tilde{\beta}_1 = \beta_1 + \beta_2 \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1) x_{i2}}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2} + \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1) u_i}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}$$

Taking the expectation, we get

$$E(\tilde{\beta}_1) = \beta_1 + \beta_2 \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1) x_{i2}}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}$$

since $E(u_i) = 0$. Now, consider the regression of $x_2$ on $x_1$, $x_2 = \delta_0 + \delta_1 x_1 + e$. Then

$$\tilde{\delta}_1 = \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1) x_{i2}}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}$$

so $E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1$. We are done.

Summary of Direction of Bias in the Two- vs. Three-Variable Case

Bias: $\tilde{\beta}_1 - \hat{\beta}_1 = \hat{\beta}_2 \tilde{\delta}_1$

                  Corr(x1, x2) > 0    Corr(x1, x2) < 0
    beta_2 > 0    Positive bias       Negative bias
    beta_2 < 0    Negative bias       Positive bias

(The second code sketch at the end of these notes illustrates the positive-bias cell.)

[For you: Can you determine the bias in the general case of k variables?]

Omitted Variable Bias: Example (1/2)

Estimating the model $\log(wage) = \beta_0 + \beta_1 educ + ...$
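Two short numerical sketches tie the formulas above together. First, the adjusted $R^2$: the two expressions given in the remarks are algebraically equivalent, which is easy to check numerically. Below is a minimal Python/numpy sketch (not from the lecture; the simulated data and coefficient values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: n observations, k = 2 regressors (values are illustrative)
n, k = 100, 2
X = rng.normal(size=(n, k))
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(size=n)

# OLS fit with an intercept column
Z = np.column_stack([np.ones(n), X])
beta_hat = np.linalg.lstsq(Z, y, rcond=None)[0]

resid = y - Z @ beta_hat
SSR = np.sum(resid**2)            # sum of squared residuals
SST = np.sum((y - y.mean())**2)   # total sum of squares

R2 = 1 - SSR / SST
R2_adj = 1 - (SSR / (n - k - 1)) / (SST / (n - 1))

# The two expressions for the adjusted R^2 agree
assert np.isclose(R2_adj, 1 - (1 - R2) * (n - 1) / (n - k - 1))
print(f"R^2 = {R2:.4f}, adjusted R^2 = {R2_adj:.4f}")
```

Note that $k$ counts only the slope coefficients, matching the $n - k - 1$ degrees of freedom in the formula.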
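Second, the omitted variable bias: the identity $\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1$ holds exactly in any sample, so it can be verified to machine precision. The following sketch (again illustrative; the true coefficients and the data-generating process are made up for the demonstration) simulates the positive-bias cell of the table, with $\beta_2 > 0$ and $Corr(x_1, x_2) > 0$:

```python
import numpy as np

rng = np.random.default_rng(1)

# True model: y = b0 + b1*x1 + b2*x2 + u, with x1 and x2 positively correlated
n, b0, b1, b2 = 500, 1.0, 2.0, 3.0
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)   # Corr(x1, x2) > 0 by construction
y = b0 + b1 * x1 + b2 * x2 + rng.normal(size=n)

def ols(y, *cols):
    """OLS with an intercept; returns the coefficient vector."""
    Z = np.column_stack([np.ones(len(y))] + list(cols))
    return np.linalg.lstsq(Z, y, rcond=None)[0]

bh = ols(y, x1, x2)   # full model:          (b0_hat, b1_hat, b2_hat)
bt = ols(y, x1)       # misspecified model:  (b0_tilde, b1_tilde)
dt = ols(x2, x1)      # auxiliary x2 on x1:  (d0_tilde, d1_tilde)

# The in-sample identity: b1_tilde = b1_hat + b2_hat * d1_tilde
assert np.isclose(bt[1], bh[1] + bh[2] * dt[1])
print(f"b1_tilde = {bt[1]:.4f} vs. true b1 = {b1} (positive bias, as the table predicts)")
```

With $\hat{\beta}_2 > 0$ and $\tilde{\delta}_1 > 0$, the misspecified slope overstates $\beta_1$, exactly as the summary table indicates.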