When OLS Goes Wrong: Omitted Variables, Selection, Measurement Error, Missing Data and Simultaneous Equations
PAM 3100
Professor Michael Lovenheim
Fall 2010

When OLS Goes Wrong: Potential Pitfalls with the OLS Estimator

There are several potential problems that can cause endogeneity, which is correlation between the error term and the explanatory variables. Endogeneity causes our OLS estimators to be biased. We will examine several of the most common problems one might run into with linear regression and attempt to sign the associated biases:

1. Omitted Variables Bias: What happens when you omit a variable that belongs in the population model?
2. Selection: What happens when people are unobservably different in ways that are correlated with the Xs?
3. Measurement Error: Is it a problem if our data are measured with error?
4. Missing Data: How are our estimates affected by missing data points for individuals and by nonrandom samples?
5. Simultaneous Equations Bias: What if our regression equation has variables that are the outcome of an equilibrium process, so they are jointly determined (such as prices and quantities)?

Often, the solution to these problems is very complicated and requires methods well outside the scope of this course. But it is important to be aware of them in order to understand what parameter you are actually estimating when you run a regression.

At the end of the course, we will talk about some techniques that can fix some of these problems. But sometimes the endogeneity issues are so severe that we simply cannot learn anything from a regression. That is, our methodological approach may not be appropriate for the question we are studying. This is very context-specific.

When OLS Goes Wrong: Omitted Variables Bias

Let's say we have a population model:

\[ y = \alpha + \beta_1 x_1 + \beta_2 x_2 + u \]

But we cannot observe $x_2$. Thus, we estimate

\[ y = \alpha + \beta_1 x_1 + u. \]
The expected value of the slope estimator from this bivariate regression is

\[
\begin{aligned}
E[\hat{\beta}_1] &= \frac{cov(y, x_1)}{var(x_1)} = \frac{cov(\alpha + \beta_1 x_1 + \beta_2 x_2, x_1)}{var(x_1)} \\
&= \frac{cov(\beta_1 x_1, x_1)}{var(x_1)} + \frac{cov(\beta_2 x_2, x_1)}{var(x_1)} \\
&= \beta_1 + \beta_2 \frac{cov(x_2, x_1)}{var(x_1)} = \beta_1 + \beta_2 \delta_1
\end{aligned}
\]

The error term $u$ drops out of the covariance because it is assumed to be uncorrelated with $x_1$. Here, $\delta_1$ is the slope coefficient from a bivariate regression of $x_2$ on $x_1$. If $x_2$ is correlated with $y$, so $\beta_2 \neq 0$, and if $x_1$ and $x_2$ are correlated, so $\delta_1 \neq 0$, then our bivariate regression will produce a biased estimate of $\beta_1$. The bias is given by $\beta_2 \delta_1$.

This bias is another way of thinking about the Gauss-Markov condition that $E[u \mid X] = 0$. This condition says that if there is a variable, $x_2$, that belongs in the model for $y$ (such that $\beta_2$ would be nonzero), the variable must be uncorrelated with $x_1$ (such that $\delta_1$ is zero) in order for a bivariate regression to generate an unbiased estimate of $\beta_1$. ...
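The omitted variables bias result can be checked with a small simulation. The sketch below is illustrative only and not part of the lecture: the function name `simulate_ovb`, the parameter values (intercept 1, slopes 2 and 3, and a slope of 0.5 linking the omitted variable to the included one) and the sample size are all assumptions chosen for the example. It generates data from a two-regressor population model, runs the regression that leaves out the second regressor, and compares the resulting slope with the true slope plus the bias term (the omitted variable's coefficient times the slope from regressing the omitted variable on the included one).

```python
import numpy as np


def simulate_ovb(n=200_000, alpha=1.0, beta1=2.0, beta2=3.0, rho=0.5, seed=0):
    """Simulate y = alpha + beta1*x1 + beta2*x2 + u and omit x2 (hypothetical values)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    # x2 is correlated with x1; the population slope of x2 on x1 is rho
    x2 = rho * x1 + rng.normal(size=n)
    u = rng.normal(size=n)  # error term, independent of x1 and x2
    y = alpha + beta1 * x1 + beta2 * x2 + u

    # Short (bivariate) regression slope: cov(y, x1) / var(x1)
    cov_y_x1 = np.cov(y, x1)
    short_slope = cov_y_x1[0, 1] / cov_y_x1[1, 1]

    # Slope from regressing the omitted x2 on x1: cov(x2, x1) / var(x1)
    cov_x2_x1 = np.cov(x2, x1)
    delta1 = cov_x2_x1[0, 1] / cov_x2_x1[1, 1]

    # Long regression that includes x2, for comparison
    X = np.column_stack([np.ones(n), x1, x2])
    long_coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

    return {
        "short_slope": short_slope,
        "delta1": delta1,
        "long_beta1": long_coefs[1],
        "long_beta2": long_coefs[2],
        "predicted_short_slope": beta1 + beta2 * rho,
    }


if __name__ == "__main__":
    for name, value in simulate_ovb().items():
        print(f"{name}: {value:.4f}")
```

Under these assumed values, the short regression slope should land close to the true slope plus the bias term rather than close to the true slope itself, while the long regression recovers both coefficients. Setting `rho=0.0` or `beta2=0.0` removes the bias, which matches the two conditions stated above.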
This note was uploaded on 01/30/2012 for the course PAM 3100 at Cornell University (Engineering School).
 '08
 ABDUS,S.
