Imbens, Lecture Notes 3, ARE213 Spring 06

ARE213 Econometrics, Spring 2006
UC Berkeley Department of Agricultural and Resource Economics

Ordinary Least Squares III: Omitted Variable Bias and Proxy Variables (W 4.3)

A. Omitted Variable Bias

Often we estimate a linear regression function without being completely sure that we have included all the relevant regressors. Here we investigate how omitting a variable affects the coefficients on the regressors we are most interested in. Suppose the true regression function is

$$Y_i = \beta_0 + \beta_1 X_{i1} + \ldots + \beta_K X_{iK} + \beta_Z Z_i + \varepsilon_i,$$

with $\varepsilon_i \perp (X_i, Z_i)$ and $E[\varepsilon_i] = 0$. We refer to this as the long regression. Now suppose we estimate the regression function without $Z_i$:

$$Y_i = \gamma_0 + \gamma_1 X_{i1} + \ldots + \gamma_K X_{iK} + \eta_i,$$

referred to as the short regression. This regression is largely definitional: the coefficients are defined to be $\gamma = (E[XX'])^{-1}(E[XY])$, so that $E[\eta_i X_i] = 0$, but not necessarily $\eta_i \perp X_i$. In addition, it is useful to consider the artificial regression of the omitted $Z_i$ on a constant and the $X_{ik}$:

$$Z_i = \delta_0 + \delta_1 X_{i1} + \ldots + \delta_K X_{iK} + \nu_i.$$

Again this is definitional: choose $\delta = (E[XX'])^{-1}(E[XZ])$, so that $\nu = Z - X'\delta$ is by definition uncorrelated with $X$.

If we estimate the short regression and focus on the $k$th regressor, we will estimate $\gamma_k$. We are interested in the relation between $\gamma_k$ and $\beta_k$. To see what this will look like, consider the long regression and substitute in for the omitted $Z_i$:

$$
\begin{aligned}
Y_i &= \beta_0 + \beta_1 X_{i1} + \ldots + \beta_K X_{iK} + \beta_Z Z_i + \varepsilon_i \\
    &= \beta_0 + \beta_1 X_{i1} + \ldots + \beta_K X_{iK} + \beta_Z \left( \delta_0 + \delta_1 X_{i1} + \ldots + \delta_K X_{iK} + \nu_i \right) + \varepsilon_i \\
    &= (\beta_0 + \beta_Z \delta_0) + (\beta_1 + \beta_Z \delta_1) X_{i1} + \ldots + (\beta_K + \beta_Z \delta_K) X_{iK} + (\beta_Z \nu_i + \varepsilon_i).
\end{aligned}
$$

Since the composite error term $\beta_Z \nu_i + \varepsilon_i$ is uncorrelated with the $X$s by definition, the regression coefficients in this representation are what you get from the short regression.

So $\gamma_k = \beta_k + \beta_Z \delta_k$. That is, the omitted variable bias (the difference between the coefficient in the short regression, $\gamma_k$, and the coefficient in the long regression, $\beta_k$) is equal to the product of the coefficient on the omitted variable, $\beta_Z$, and the coefficient on the included regressor $X_{ik}$ in a regression of the omitted variable on all included regressors, $\delta_k$ ...
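The omitted variable bias relation, that each short-regression coefficient equals the corresponding long-regression coefficient plus the coefficient on the omitted variable times the coefficient from the artificial regression of $Z$ on the included regressors, can be checked numerically. The following is a minimal sketch, not part of the original notes: the data-generating process, the parameter values, and the function name `ovb_check` are all assumptions chosen for illustration. It fits the long, short, and artificial regressions by least squares with NumPy and compares the short-regression coefficients with the values implied by the formula.

```python
import numpy as np


def ovb_check(n=20_000, seed=0):
    """Compare short-regression coefficients with long + coef_Z * delta.

    All parameter values below are hypothetical, chosen only for illustration.
    """
    rng = np.random.default_rng(seed)

    # Included regressors X_1, X_2 and an omitted Z that is correlated with them.
    X = rng.normal(size=(n, 2))
    Z = 0.5 + 0.8 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(size=n)

    # Long regression: Y depends on a constant, X_1, X_2, and Z.
    eps = rng.normal(size=n)
    Y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + 1.5 * Z + eps

    const = np.ones((n, 1))
    W_short = np.hstack([const, X])            # constant, X_1, X_2
    W_long = np.hstack([W_short, Z[:, None]])  # constant, X_1, X_2, Z

    # Least-squares fits of the three regressions.
    long_coef = np.linalg.lstsq(W_long, Y, rcond=None)[0]
    short_coef = np.linalg.lstsq(W_short, Y, rcond=None)[0]
    delta = np.linalg.lstsq(W_short, Z, rcond=None)[0]

    # Formula: short coefficient = long coefficient + (coef on Z) * delta.
    implied = long_coef[:3] + long_coef[3] * delta
    return short_coef, implied


short_coef, implied = ovb_check()
print("short regression:", short_coef)
print("long + coef_Z * delta:", implied)
```

In sample, the two vectors agree up to floating-point error, because the least-squares residual from the long regression is orthogonal to the included regressors. Under the assumed parameter values, the coefficient on $X_1$ in the short regression should be close to $2.0 + 1.5 \times 0.8 = 3.2$ rather than the long-regression value of $2.0$.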
This note was uploaded on 08/01/2008 for the course ARE 213 taught by Professor Imbens during the Spring '06 term at University of California, Berkeley.
