STAT6220 Chapter 2. Multiple Regression

1. Introduction

Multiple regression allows more than one $x$ variable. You can use it to explore the linear relationship between a dependent variable and several independent variables.

The dependent variable should be measured on a continuous scale (such as an interval or ratio scale). The predictor variables should be measured on a ratio, interval, or ordinal scale. A nominal predictor variable is legitimate only if it is dichotomous, i.e., it has no more than two categories.

Multiple regression requires a large number of observations: the number of cases (observations) must substantially exceed the number of predictor variables.

2. Model

Independent (predictor) variables: $x_1, \ldots, x_k$

Data: $(y_1, x_{11}, x_{12}, \ldots, x_{1k}), \ldots, (y_n, x_{n1}, x_{n2}, \ldots, x_{nk})$

The model: $y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon$

In terms of observations: $y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \varepsilon_i$, $i = 1, \ldots, n$, where $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are independent $N(0, \sigma^2)$ random errors.

The coefficient $\beta_j$ determines the contribution of the independent variable $x_j$. $\beta_j$ is called the partial coefficient; it is the slope of the line relating $y$ to $x_j$ when all the other $x$'s are held fixed.

$\beta_0$ is the $y$-intercept, the mean value of $y$ when all the $x$'s are zero.
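The model above can be made concrete with a small simulation. The following is a minimal sketch, not part of the course material: the coefficient values, the design points, and the helper names `mean_response` and `simulate` are all hypothetical and chosen only for illustration. It draws observations from a model with $k = 2$ and checks that, with $x_2$ held fixed, a one-unit increase in $x_1$ changes the mean of $y$ by $\beta_1$.

```python
import random


def mean_response(beta, x):
    """Mean of y: beta_0 + beta_1*x_1 + ... + beta_k*x_k for one observation."""
    return beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))


def simulate(beta, rows, sigma, seed=0):
    """Draw y_i = mean_i + eps_i, with eps_i independent N(0, sigma^2)."""
    rng = random.Random(seed)
    return [mean_response(beta, x) + rng.gauss(0.0, sigma) for x in rows]


# Hypothetical values, chosen only for illustration (k = 2).
beta = [1.0, 2.0, 0.5]                        # beta_0, beta_1, beta_2
rows = [(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)]   # x_2 held fixed at 3

# One simulated sample of size n = 3 (noisy observations).
y = simulate(beta, rows, sigma=1.0)

# The means, without noise: each one-unit step in x_1 adds beta_1 = 2.
means = [mean_response(beta, x) for x in rows]
print(means)                 # [2.5, 4.5, 6.5]
print(means[1] - means[0])   # 2.0
```

The simulated `y` values scatter around `means` with standard deviation `sigma`; only the means follow the straight line exactly.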

When $k = 2$: the model is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$, and its mean function is a plane over the $(x_1, x_2)$ coordinates rather than a line.

Regression coefficients: $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$ are estimates of $\beta_0, \beta_1, \ldots, \beta_k$

Fitted regression equation: $\hat y = \hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_k x_k$

Predicted value of $y_i$: $\hat y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \cdots + \hat\beta_k x_{ik}$

Residuals: $e_i = y_i - \hat y_i$

Goal: Using the least squares method, choose $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$ to minimize the residual sum of squares

$$SSE = RSS = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat y_i)^2 = \sum_{i=1}^{n} \left( y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \cdots - \hat\beta_k x_{ik} \right)^2$$

First order model

A first order model includes only terms for quantitative independent variables. $x_1, \ldots, x_k$ are all quantitative variables that are not functions of other variables.

$\beta_j$ represents the expected change in $y$ for a unit increase in $x_j$ while holding all other $x$'s constant.

Suppose the fitted model is $\hat y = 1 + 2x_1 + x_2$.
o When $x_2 = 0$, the relationship between $\hat y$ and $x_1$ is given by $\hat y = 1 + 2x_1$.
o Similarly, when $x_2 = 2$, $\hat y = 1 + 2x_1 + 2 = 3 + 2x_1$.
o If you repeat the process for other values of the fixed independent variables, you will obtain a set of parallel straight lines.
o This indicates that the effect of the independent variable $x_j$ on $E(y)$ is independent of all the other independent variables in the model, and this effect is measured by the slope $\beta_j$.
o We say the effects of the independent variables are additive.

Interaction

The effect of an $x$ variable varies at different levels of another $x$ variable.

The model contains two-way cross-product terms, for example, $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$.

With the interaction term, the effect of $x_1$ on $y$ is measured by $\beta_1 + \beta_3 x_2$, since $E(y) = (\beta_0 + \beta_2 x_2) + (\beta_1 + \beta_3 x_2) x_1$. The effect changes as $x_2$ changes.
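To see least squares and the interaction effect together, here is a minimal sketch under stated assumptions. The data are hypothetical and noise-free, generated from $y = 1 + 2x_1 + x_2 + 0.5\,x_1 x_2$, and the helpers `solve`, `least_squares`, `sse`, and `slope_x1` are illustrative names, not part of the course material. The fit solves the normal equations $(X^\top X)\hat\beta = X^\top y$ by Gaussian elimination, which is one standard way to minimize SSE; statistical software typically uses more numerically stable methods. For the interaction model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$, the slope in $x_1$ at a fixed $x_2$ is $\beta_1 + \beta_3 x_2$, so the fitted lines are no longer parallel.

```python
def solve(A, b):
    """Solve the square system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix [A | b]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def least_squares(X, y):
    """Return the coefficients minimizing SSE, via the normal equations."""
    p = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    return solve(XtX, Xty)


def sse(X, y, beta):
    """Residual sum of squares: sum of (y_i - yhat_i)^2."""
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))


def slope_x1(beta, x2):
    """Effect of x_1 at a fixed x_2 in the interaction model: beta_1 + beta_3 * x_2."""
    return beta[1] + beta[3] * x2


# Hypothetical noise-free data on a 3 x 3 grid of (x1, x2) values.
points = [(x1, x2) for x1 in (0.0, 1.0, 2.0) for x2 in (0.0, 1.0, 2.0)]
y = [1 + 2 * x1 + x2 + 0.5 * x1 * x2 for x1, x2 in points]

# Design matrix columns: intercept, x1, x2, x1*x2.
X = [[1.0, x1, x2, x1 * x2] for x1, x2 in points]

beta_hat = least_squares(X, y)
print([round(b, 6) for b in beta_hat])
print(sse(X, y, beta_hat))
for x2 in (0.0, 1.0, 2.0):
    print(x2, round(slope_x1(beta_hat, x2), 6))
```

Because the data contain no noise, the fit recovers the generating coefficients up to rounding error and the SSE is essentially zero. Dropping the last column of `X` would fit the first order (additive) model instead, in which the slope in $x_1$ is the same at every value of $x_2$.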

