Imbens, Lecture Notes 3, ARE213 Spring ’06

ARE213 Econometrics
Spring 2006
UC Berkeley Department of Agricultural and Resource Economics

Ordinary Least Squares III: Omitted Variable Bias and Proxy Variables (W 4.3)

A. Omitted Variable Bias

Often we estimate a linear regression function but we are not completely sure that we have included all the relevant regressors. Here we investigate how omitting a variable affects the coefficients on the regressors we are most interested in. Suppose the true regression function is

\[ Y_i = \beta_0 + \beta_1 \cdot X_{i1} + \ldots + \beta_K \cdot X_{iK} + \beta_Z \cdot Z_i + \varepsilon_i, \]

with $\varepsilon_i \perp (X_i, Z_i)$ and $E[\varepsilon_i] = 0$. We refer to this as the “long regression.” Now suppose we estimate the regression function without $Z_i$:

\[ Y_i = \gamma_0 + \gamma_1 \cdot X_{i1} + \ldots + \gamma_K \cdot X_{iK} + \eta_i, \]

referred to as the “short regression.” This regression is largely definitional: the coefficients are defined to be $\gamma = (E[XX'])^{-1}(E[XY])$, so that $E[\eta_i \cdot X_i] = 0$, but not necessarily $\eta_i \perp X_i$. In addition it is useful to consider the “artificial regression” of the omitted $Z_i$ on a constant and the $X_{ik}$:

\[ Z_i = \delta_0 + \delta_1 \cdot X_{i1} + \ldots + \delta_K \cdot X_{iK} + \nu_i. \]

Again this is definitional: choose $\delta = (E[XX'])^{-1}(E[XZ])$ so that $\nu = Z - X'\delta$ is by definition uncorrelated with $X$.

If we estimate the short regression, and focus on the $k$th regressor, we will estimate $\gamma_k$. We are interested in the relation between $\gamma_k$ and $\beta_k$. To see what this will look like, consider
the long regression, and substitute in for the omitted $Z_i$:

\[
\begin{aligned}
Y_i &= \beta_0 + \beta_1 \cdot X_{i1} + \ldots + \beta_K \cdot X_{iK} + \beta_Z \cdot Z_i + \varepsilon_i \\
    &= \beta_0 + \beta_1 \cdot X_{i1} + \ldots + \beta_K \cdot X_{iK} + \beta_Z \cdot (\delta_0 + \delta_1 \cdot X_{i1} + \ldots + \delta_K \cdot X_{iK} + \nu_i) + \varepsilon_i \\
    &= (\beta_0 + \beta_Z \cdot \delta_0) + (\beta_1 + \beta_Z \cdot \delta_1) \cdot X_{i1} + \ldots + (\beta_K + \beta_Z \cdot \delta_K) \cdot X_{iK} + (\beta_Z \cdot \nu_i + \varepsilon_i).
\end{aligned}
\]

The composite error term $\beta_Z \cdot \nu_i + \varepsilon_i$ is uncorrelated with the $X$s: $\nu_i$ is uncorrelated with them by definition, and $\varepsilon_i$ by the assumption on the long regression. The regression coefficients in this representation are therefore what you get from the short regression. So,

\[ \gamma_k = \beta_k + \beta_Z \cdot \delta_k. \]

In words, the omitted variable bias (the difference between the coefficient in the short regression, $\gamma_k$, and the coefficient in the long regression, $\beta_k$) is equal to the product of the coefficient on the omitted variable, $\beta_Z$, and the coefficient on the included regressor $X_{ik}$ in a regression of the omitted variable on all included regressors, $\delta_k$.
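
The relation $\gamma_k = \beta_k + \beta_Z \cdot \delta_k$ can be checked numerically. The following is a minimal sketch, not part of the original notes: the data-generating process, the coefficient values, the sample size, and the helper `ols` are all hypothetical choices made for illustration. It fits the long, short, and artificial regressions by least squares with NumPy and compares the short-regression coefficients with $\hat\beta_k + \hat\beta_Z \cdot \hat\delta_k$.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients of y on the columns of X (X already holds a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data-generating process (all numbers are illustrative assumptions).
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
z = 0.5 + 0.8 * x1 - 0.3 * x2 + rng.normal(size=n)              # omitted variable
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 1.5 * z + rng.normal(size=n)    # long regression

X = np.column_stack([np.ones(n), x1, x2])   # constant and included regressors
XZ = np.column_stack([X, z])                # included regressors plus Z

beta_long = ols(XZ, y)      # long regression: Y on (1, X, Z)
gamma_short = ols(X, y)     # short regression: Y on (1, X)
delta = ols(X, z)           # artificial regression: Z on (1, X)

beta_x, beta_z = beta_long[:-1], beta_long[-1]
implied_gamma = beta_x + beta_z * delta     # beta_k + beta_Z * delta_k
bias = gamma_short - beta_x                 # omitted variable bias

print("short regression coefficients:", gamma_short)
print("beta_k + beta_Z * delta_k:    ", implied_gamma)
print("omitted variable bias:        ", bias)
```

With least-squares estimates the two printed coefficient vectors agree up to floating-point error, because the decomposition holds exactly in sample as well as in the population. Under the assumed coefficients, the population bias on the first regressor is $1.5 \times 0.8 = 1.2$, and the estimated bias should be close to that value.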