The adjusted R^2 is defined as

\bar{R}^2 = 1 - \frac{SSR/(n-k-1)}{SST/(n-1)}

It is an alternative measure of goodness-of-fit. The adjusted R^2 takes into account the number of variables in a model. It may therefore decrease as additional variables are added.
Remarks:
o It's easy to show that the adjusted R^2 is equal to \bar{R}^2 = 1 - (1 - R^2)(n-1)/(n-k-1)
o We can compare the fit of two models (with the same y) by comparing the adjusted R^2
o We cannot use the adjusted R^2 to compare models with different y's (e.g. y vs. \log(y))

VER. 9/25/2012. © P. KOLM

What Happens When We Have Too Many or Too Few Variables?

This is an important question to consider when doing practical and empirical research.
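The two points above can be checked numerically. The sketch below (synthetic data; all variable names and numbers are made up for illustration) fits a regression with and without an irrelevant extra regressor: R^2 never decreases when a variable is added, but the adjusted R^2 penalizes the extra degree of freedom.

```python
import numpy as np

def r2_and_adjusted(y, X):
    """OLS fit; returns (R^2, adjusted R^2).
    X must already contain a column of ones for the intercept."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = resid @ resid                       # sum of squared residuals
    sst = ((y - y.mean()) ** 2).sum()         # total sum of squares
    n, k_plus_1 = X.shape                     # k regressors + intercept
    r2 = 1 - ssr / sst
    adj = 1 - (1 - r2) * (n - 1) / (n - k_plus_1)   # n - k - 1 in the denominator
    return r2, adj

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)       # y depends only on x1
noise = rng.normal(size=n)                    # an irrelevant regressor

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, noise])

r2_s, adj_s = r2_and_adjusted(y, X_small)
r2_b, adj_b = r2_and_adjusted(y, X_big)
print(r2_s, adj_s)   # small model
print(r2_b, adj_b)   # larger model: R^2 is at least as big, adjusted R^2 need not be
```

Since both models have the same y, comparing their adjusted R^2 values is a legitimate model comparison here.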
Main results:
Too many variables: What happens if we include variables in our specification that don't belong?
o OLS estimators remain unbiased
o The variances of the estimators increase
Too few variables: What if we exclude a variable from our specification that does belong?
o OLS will usually be biased
o This is called the omitted variable bias

The Omitted Variable Bias Examined

Consider the two models:

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u   (true model)
y = \beta_0 + \beta_1 x_1 + u   (misspecified model)
with the fitted regressions

\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2   and   \tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1

Question: What is the relationship between the estimates \tilde{\beta}_1 and \hat{\beta}_1?
Answer: It can be shown that

\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1

where \tilde{\delta}_1 is the slope of a regression of x_2 on x_1 (i.e. x_2 = \delta_0 + \delta_1 x_1 + e). Therefore, the omitted variable bias in the two-variable linear regression is \tilde{\beta}_1 - \hat{\beta}_1 = \hat{\beta}_2 \tilde{\delta}_1.
In general, \tilde{\beta}_1 \neq \hat{\beta}_1 unless:
1. \hat{\beta}_2 \equiv 0 (i.e. no partial effect of x_2), or
2. x_1 and x_2 are uncorrelated in the sample (\tilde{\delta}_1 \equiv 0)

Proof:
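The relationship between the short-regression and long-regression slopes is an exact algebraic identity in any sample, which makes it easy to verify numerically. A minimal sketch with simulated data (coefficients and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)        # x2 correlated with x1
u = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + u         # true model includes x2

def ols(y, regressors):
    """OLS with intercept; returns coefficient vector (intercept first)."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

b_full = ols(y, [x1, x2])      # (beta0_hat, beta1_hat, beta2_hat)
b_short = ols(y, [x1])         # (beta0_tilde, beta1_tilde) — omits x2
delta = ols(x2, [x1])          # regression of x2 on x1: (delta0, delta1_tilde)

lhs = b_short[1]                           # beta1_tilde
rhs = b_full[1] + b_full[2] * delta[1]     # beta1_hat + beta2_hat * delta1_tilde
print(lhs, rhs)                            # identical up to floating-point error
```

The two printed numbers agree to machine precision, regardless of sample size or how the data were generated.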
We have that

\tilde{\beta}_1 = \frac{\sum_{i=1}^n (x_{i1} - \bar{x}_1) y_i}{\sum_{i=1}^n (x_{i1} - \bar{x}_1)^2}

for the model that omits x_2. Using the true model y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i we can determine the numerator of \tilde{\beta}_1 as follows:

\sum_{i=1}^n (x_{i1} - \bar{x}_1) y_i = \sum_{i=1}^n (x_{i1} - \bar{x}_1)(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i)
= \beta_1 \sum_{i=1}^n (x_{i1} - \bar{x}_1)^2 + \beta_2 \sum_{i=1}^n (x_{i1} - \bar{x}_1) x_{i2} + \sum_{i=1}^n (x_{i1} - \bar{x}_1) u_i

(The \beta_0 term vanishes because \sum_{i=1}^n (x_{i1} - \bar{x}_1) = 0, and \sum_{i=1}^n (x_{i1} - \bar{x}_1) x_{i1} = \sum_{i=1}^n (x_{i1} - \bar{x}_1)^2.) Therefore,

\tilde{\beta}_1 = \beta_1 + \beta_2 \frac{\sum_{i=1}^n (x_{i1} - \bar{x}_1) x_{i2}}{\sum_{i=1}^n (x_{i1} - \bar{x}_1)^2} + \frac{\sum_{i=1}^n (x_{i1} - \bar{x}_1) u_i}{\sum_{i=1}^n (x_{i1} - \bar{x}_1)^2}

Taking the expectation, we get

E(\tilde{\beta}_1) = \beta_1 + \beta_2 \frac{\sum_{i=1}^n (x_{i1} - \bar{x}_1) x_{i2}}{\sum_{i=1}^n (x_{i1} - \bar{x}_1)^2}

since E(u_i) = 0. Now, consider the regression of x_2 on x_1, x_2 = \delta_0 + \delta_1 x_1 + e. Then

\tilde{\delta}_1 = \frac{\sum_{i=1}^n (x_{i1} - \bar{x}_1) x_{i2}}{\sum_{i=1}^n (x_{i1} - \bar{x}_1)^2}

so E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1. We are done.

Summary of Direction of Bias in the Two- vs. Three-Variable Case

Bias: E(\tilde{\beta}_1) - \beta_1 = \beta_2 \tilde{\delta}_1

                   Corr(x_1, x_2) > 0    Corr(x_1, x_2) < 0
\beta_2 > 0        Positive bias         Negative bias
\beta_2 < 0        Negative bias         Positive bias

[For you: Can you determine the bias in the general case of k variables?]
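The top-left cell of the table (beta_2 > 0 and positive correlation gives positive bias) can be illustrated with a small Monte Carlo sketch; the design and all parameter values below are made up for the illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 100, 2000
beta1, beta2 = 2.0, 3.0
d1 = 0.5                                   # slope of x2 on x1 (positive correlation)

est = np.empty(reps)
for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = d1 * x1 + rng.normal(size=n)
    y = 1.0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1])  # misspecified: x2 omitted
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    est[r] = b[1]                          # beta1_tilde for this replication

print(est.mean())   # roughly beta1 + beta2 * d1 = 3.5, not beta1 = 2.0
```

The average estimate lands near beta1 + beta2*d1 rather than beta1, matching E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1: with beta2 > 0 and Corr(x1, x2) > 0 the bias is positive.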
Omitted Variable Bias: Example (1/2)

Estimating the model \log(wage) = \beta_0 + \beta_1 educ + ...
This document was uploaded on 02/17/2014 for the course COURANT G63.2751.0 at NYU.