Multiple Linear Regression

C22.0103: Statistics for Business Control: Regression and Forecasting
Hong Luo
Section 003, Spring 2009
Tue/Thu/Fri, 11:00 - 12:15pm, Tisch 200
Stern School of Business, New York University

(Multiple) LINEAR REGRESSION

- Multiple Linear Regressions
- Model Building
- Multicollinearity

Introduction to Multiple Linear Regression

- A motivating example
- The multiple linear regression model
- Least squares estimates
- Interpretation of the coefficients
- Fitted value and residuals
- $R^2$ and adjusted $R^2$

A Motivating Example: gasoline consumption

Shouldn't the demand curve slope downward?

- Theory suggests that demand for gasoline depends on the price of gasoline and the income level: $G = f(\text{price}, \text{income})$
- The upward slope seems to be the trace of equilibrium points along the supply curve.

[Figure: supply and demand diagram with axes Q and P, demand curves D(income 1), D(income 2), D(income 3), and supply curve S]

- Micro 101 suggests that, holding income constant, demand should be negatively correlated with price.

Gasoline Consumption: US market, 1953 - 2004

Data file: newGas.MTP. Variables include:

- GasQ: index number for gasoline physical consumption
- GasP: index number for gasoline price
- PCIncome: per capita real income
- Pop: population in millions
- PCGasQ: GasQ/Pop

Observations:

- All three variables are increasing over time.
- Income is growing over time, which suggests that the main driver of gasoline consumption might be income, not price.

The Simple Linear Regression: without controlling for income

$\text{GasQPC} = \alpha + \beta_1 \, \text{GasP} + \varepsilon$

The estimated coefficient of GasP is positive: when price increases by 1 unit, consumption is expected to increase by 0.0016 units.

Multiple Regression: holding income constant

$\text{GasQPC} = \alpha + \beta_1 \, \text{GasP} + \beta_2 \, \text{PCIncome} + \varepsilon$

- It looks like the theory works: controlling for income, we get a negative price coefficient!

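The sign change between the simple and the multiple regression can be reproduced in a small simulation. The sketch below is only an illustration under stated assumptions: the data are synthetic (not the newGas.MTP file), the "true" coefficients -0.3 and 0.4 are invented, and the helper name `ols` is hypothetical. It fits both regressions with NumPy's least squares routine and compares the price coefficients.

```python
import numpy as np

# Synthetic data only: these numbers are invented for illustration and are
# NOT the newGas.MTP data from the lecture.
rng = np.random.default_rng(0)
n = 52  # one observation per year, mirroring 1953-2004

income = np.linspace(10.0, 30.0, n) + rng.normal(0.0, 0.5, n)  # income rises over time
price = 0.5 * income + rng.normal(0.0, 1.0, n)                 # price trends up with income
# Assumed "true" demand: price lowers consumption, income raises it.
gas = 2.0 - 0.3 * price + 0.4 * income + rng.normal(0.0, 0.2, n)


def ols(y, *xs):
    """Return least squares coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *xs])  # prepend a column of ones for the constant
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


simple = ols(gas, price)            # omits income
multiple = ols(gas, price, income)  # holds income constant

print("price coefficient, simple regression:  ", simple[1])
print("price coefficient, multiple regression:", multiple[1])
```

In this setup the simple regression reports a positive price coefficient because price acts as a stand-in for the omitted income variable. Once income enters the regression, the price coefficient turns negative, close to the assumed -0.3.
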
The Multiple Linear Regression Model

- Theory: $y = f(x_1, x_2, \ldots, x_K)$
- $x_1, x_2, \ldots, x_K$ are variables that may be useful in predicting $y$.
- Theory specifies the model: there is a reason why certain variables appear in the equation.
- The multiple linear regression model: $y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_K x_K + \varepsilon$
- $\alpha$: the constant
- $\beta_k$: the regression coefficient of $x_k$. It represents the slope of the line relating $y$ to $x_k$ when all the other $x$'s are held fixed.
- $\varepsilon$: the random noise, the collection of all factors that are not captured by the $x$'s. We make the same assumptions on $\varepsilon$ as we did in simple linear regression:
  - $\varepsilon$ is a random variable with mean zero and variance $\sigma^2$
  - $\varepsilon$ is independent of the $x$'s

Again, the linear regression model is composed of two parts:

- The systematic part: $\alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_K x_K$
  - It accounts for the part of $y$ that depends systematically on the explanatory variables ....
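
The two-part structure of the model can be made concrete with a short simulation. This is a minimal sketch, assuming K = 2 and invented parameter values; the names `alpha`, `beta`, `sigma`, and `systematic` are hypothetical and chosen only to mirror the notation above. It builds y from a systematic part plus independent noise, and checks that raising $x_1$ by one unit while holding $x_2$ fixed changes the systematic part by exactly $\beta_1$.

```python
import numpy as np

# Hypothetical parameter values chosen for illustration only.
alpha = 1.0                   # the constant
beta = np.array([2.0, -0.5])  # beta_1, beta_2 (K = 2)
sigma = 0.3                   # standard deviation of the noise


def systematic(x):
    """Systematic part alpha + beta_1*x_1 + ... + beta_K*x_K for one row or many rows of x."""
    return alpha + np.asarray(x) @ beta


rng = np.random.default_rng(1)
n = 10_000
x = rng.uniform(0.0, 10.0, size=(n, 2))  # explanatory variables
eps = rng.normal(0.0, sigma, size=n)     # noise: mean zero, variance sigma^2, drawn independently of x
y = systematic(x) + eps                  # observed response = systematic part + noise

# Interpretation of beta_1: raise x_1 by one unit while holding x_2 fixed.
base = np.array([4.0, 7.0])
bumped = base + np.array([1.0, 0.0])
slope_1 = systematic(bumped) - systematic(base)

print("change in systematic part:", slope_1)
print("sample mean of noise:     ", eps.mean())
print("sample variance of noise: ", eps.var())
```

The change in the systematic part equals `beta[0]` regardless of the value at which $x_2$ is held, which is what "slope when all the other x's are held fixed" means in a linear model. The sample mean and variance of the noise should land near 0 and `sigma**2`.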
