ECON 103, Lecture 7: Multiple regression model Maria Casanova April 21st (version 2) Maria Casanova Lecture 7

Requirements for this lecture: Chapter 4 and chapter 6 of Stock and Watson
0. Introduction

Recall the example in Lecture 6, where we modeled the relationship between wages ($Y$) and fitness ($X_1$) using a univariate regression model:

$$Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$$

By restricting our attention to the relationship between $Y$ and $X_1$, we ignored other potentially important determinants of wages, such as age ($X_2$).

Omitting a relevant regressor leads to an incorrect estimate of the population regression line (i.e. it biases the OLS estimator $\hat{\beta}_1$) when two conditions hold:

1. The omitted regressor ($X_2$) is correlated with the included regressor $X_1$.
2. The omitted regressor ($X_2$) affects the dependent variable $Y$.
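A small simulation can make the two conditions concrete. The sketch below is illustrative only: the data-generating process, the age coefficient of 10, the noise levels, the sample size, and the helper names `ols_slope` and `simulate_slope` are all assumptions introduced here, not part of the lecture. The values $\beta_0 = 1000$ and $\beta_1 = 0$ are chosen to match the annotations in the lecture's figures.

```python
import random


def ols_slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_slope(fitness_age_link, n=20000, seed=0):
    """Simulate wages and return the slope of wage on fitness alone.

    fitness_age_link = 0 makes fitness unrelated to age (condition 1 fails);
    a nonzero value makes fitness correlated with age.
    """
    rng = random.Random(seed)
    beta0, beta1 = 1000.0, 0.0   # values annotated in the lecture figures
    beta2 = 10.0                 # hypothetical effect of age on wages (condition 2)
    age = [rng.uniform(20, 60) for _ in range(n)]
    fitness = [5.0 + fitness_age_link * (a - 40.0) + rng.gauss(0, 1) for a in age]
    wage = [beta0 + beta1 * f + beta2 * a + rng.gauss(0, 50)
            for f, a in zip(fitness, age)]
    # Age is deliberately left out of the regression.
    return ols_slope(fitness, wage)


slope_uncorrelated = simulate_slope(0.0)    # age omitted, but unrelated to fitness
slope_correlated = simulate_slope(-0.1)     # age omitted and correlated with fitness

print("slope, fitness uncorrelated with age:", round(slope_uncorrelated, 2))
print("slope, fitness correlated with age:  ", round(slope_correlated, 2))
```

Under these assumptions, the first slope should be close to the true value of zero, while the second should be clearly negative even though fitness has no effect on wages in the simulated population: the regression attributes to fitness part of the effect of the omitted age variable.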

Figure: $X_1$ (fitness) is uncorrelated with $X_2$ (age). [Plot of wage (axis from 0 to 1500) against fitness (axis from 0 to 10, with ends labeled "lowest" and "highest"); annotated with $\beta_0 = 1000$, $\beta_1 = 0$.]
Figure: $X_1$ (fitness) is correlated with $X_2$ (age). [Plot of wage (axis from 0 to 1500) against fitness (axis from 0 to 10, with ends labeled "lowest" and "highest"); annotated with $\beta_0 = 1000$, $\beta_1 = 0$.]

Under conditions (1) and (2), the OLS estimator suffers from omitted variable bias, which means that the first least squares assumption does not hold, i.e.

$$E(\varepsilon_i \mid X_{1i}) \neq 0$$

How do we address omitted variable bias?

One option is to divide the data into smaller groups (e.g. run separate regressions of wage on fitness for individuals aged 20-25, 25-30, etc.). This becomes impractical as we add more regressors. Moreover, it does not provide an overall measure of the effect on wages of increasing fitness holding age constant.

The estimate of the effect on wages of changing fitness holding age constant can be obtained using the multiple regression model.
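To illustrate what "holding age constant" buys us, the sketch below compares a regression of wage on fitness alone with a regression of wage on both fitness and age. It is a minimal example under stated assumptions: the simulated data, the age coefficient of 10, and the helper names `cross` and `ols_two_regressors` are hypothetical, and the two-regressor estimates are obtained by solving the OLS normal equations in closed form rather than with any particular statistics package.

```python
import random


def cross(u, v):
    """Sum of products of deviations from the mean for two series."""
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


def ols_two_regressors(x1, x2, y):
    """OLS estimates (b0, b1, b2) for y = b0 + b1*x1 + b2*x2 + error."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2          # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    n = len(y)
    b0 = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n
    return b0, b1, b2


# Hypothetical population: fitness has no effect on wages, age does.
rng = random.Random(0)
n = 20000
age = [rng.uniform(20, 60) for _ in range(n)]
fitness = [5.0 - 0.1 * (a - 40.0) + rng.gauss(0, 1) for a in age]   # correlated with age
wage = [1000.0 + 0.0 * f + 10.0 * a + rng.gauss(0, 50) for f, a in zip(fitness, age)]

# Regression of wage on fitness only: age is omitted.
b1_short = cross(fitness, wage) / cross(fitness, fitness)

# Multiple regression: age is included, so b1 is the effect of fitness holding age constant.
b0, b1, b2 = ols_two_regressors(fitness, age, wage)

print("fitness slope, age omitted: ", round(b1_short, 2))
print("fitness slope, age included:", round(b1, 2))
print("age slope:                  ", round(b2, 2))
```

In this simulated setting, the slope from the regression that omits age should be clearly negative, whereas the fitness coefficient from the multiple regression should be close to the assumed true value of zero, and the age coefficient close to 10.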
1. Multiple regression model

In the multiple regression model, more than one variable affects the dependent variable.

