Imbens, Lecture Notes 3, ARE213 Spring 06

ARE213 Econometrics, Spring 2006
UC Berkeley Department of Agricultural and Resource Economics

Ordinary Least Squares III: Omitted Variable Bias and Proxy Variables (W 4.3)

A. Omitted Variable Bias

Often we estimate a linear regression function but we are not completely sure that we have included all the relevant regressors. Here we investigate how omitting a variable affects the coefficients on the regressors we are most interested in. Suppose the true regression function is

$$Y_i = \beta_0 + \beta_1 X_{i1} + \ldots + \beta_K X_{iK} + \beta_Z Z_i + \varepsilon_i,$$

with $\varepsilon_i \perp (X_i, Z_i)$ and $E[\varepsilon_i] = 0$. We refer to this as the long regression. Now suppose we estimate the regression function without $Z_i$:

$$Y_i = \alpha_0 + \alpha_1 X_{i1} + \ldots + \alpha_K X_{iK} + \eta_i,$$

referred to as the short regression. This regression is largely definitional: the coefficients are defined to be $\alpha = (E[XX'])^{-1}(E[XY])$, so that $E[\eta_i X_i] = 0$, but not necessarily $\eta_i \perp X_i$.

In addition it is useful to consider the artificial regression of the omitted $Z_i$ on a constant and the $X_{ik}$:

$$Z_i = \delta_0 + \delta_1 X_{i1} + \ldots + \delta_K X_{iK} + \nu_i.$$

Again this is definitional: choose $\delta = (E[XX'])^{-1}(E[XZ])$ so that $\nu = Z - X'\delta$ is by definition uncorrelated with $X$.

If we estimate the short regression, and focus on the $k$th regressor, we will estimate $\alpha_k$. We are interested in the relation between $\alpha_k$ and $\beta_k$. To see what this will look like, consider the long regression, and substitute in for the omitted $Z_i$:

$$
\begin{aligned}
Y_i &= \beta_0 + \beta_1 X_{i1} + \ldots + \beta_K X_{iK} + \beta_Z Z_i + \varepsilon_i \\
    &= \beta_0 + \beta_1 X_{i1} + \ldots + \beta_K X_{iK} + \beta_Z (\delta_0 + \delta_1 X_{i1} + \ldots + \delta_K X_{iK} + \nu_i) + \varepsilon_i \\
    &= (\beta_0 + \beta_Z \delta_0) + (\beta_1 + \beta_Z \delta_1) X_{i1} + \ldots + (\beta_K + \beta_Z \delta_K) X_{iK} + (\beta_Z \nu_i + \varepsilon_i).
\end{aligned}
$$

Since the composite error term $\beta_Z \nu_i + \varepsilon_i$ is uncorrelated with the $X$s by definition, the regression coefficients in this representation are what you get from the short regression.
So, $\alpha_k = \beta_k + \beta_Z \delta_k$; that is, the omitted variable bias (the difference between the coefficient in the short regression, $\alpha_k$, and the coefficient in the long regression, $\beta_k$) is equal to the product of the coefficient on the omitted variable, $\beta_Z$, and the coefficient on the included regressor $X_{ik}$ in a regression of the omitted variable on all included regressors, $\delta_k$ ...
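
The decomposition above can be checked numerically. The following is a minimal sketch, not part of the original notes: it simulates data from a made-up long regression with two included regressors and one omitted variable, then fits the long, short, and artificial regressions by least squares with NumPy. The data-generating values, the sample size, and the names `ols`, `beta_hat`, `alpha_hat`, and `delta_hat` are all assumptions chosen for illustration. Because least-squares residuals are orthogonal to the included regressors, the short-regression coefficients should equal the long-regression coefficients plus the omitted variable's coefficient times the artificial-regression coefficients, up to floating-point error.

```python
import numpy as np


def ols(design, target):
    """Least-squares coefficients from regressing target on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data-generating process (all numbers are illustrative assumptions).
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
z = 0.5 + 0.8 * x1 - 0.3 * x2 + rng.normal(size=n)            # omitted variable
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 1.5 * z + rng.normal(size=n)  # long regression

X = np.column_stack([np.ones(n), x1, x2])  # constant plus included regressors
XZ = np.column_stack([X, z])               # adds the omitted variable as last column

beta_hat = ols(XZ, y)   # long regression: constant, x1, x2, then z
alpha_hat = ols(X, y)   # short regression: z left out
delta_hat = ols(X, z)   # artificial regression of z on the included regressors

beta_z_hat = beta_hat[-1]
implied_short = beta_hat[:-1] + beta_z_hat * delta_hat  # long coef + beta_Z * delta
bias = alpha_hat - beta_hat[:-1]                        # short minus long

print("short regression coefficients:", alpha_hat)
print("long coefficients + beta_Z * delta:", implied_short)
print("omitted variable bias:", bias)
```

In this setup the bias on `x1` should come out close to 1.5 times 0.8, since those are the assumed coefficient on the omitted variable and the assumed coefficient on `x1` in the artificial regression; the exact value depends on the simulated sample.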
This note was uploaded on 08/01/2008 for the course ARE 213 taught by Professor Imbens during the Spring '06 term at University of California, Berkeley.