Chapter 15 Multiple Regression
• Multiple Regression Model
• Least Squares Method
• Multiple Coefficient of Determination
• Model Assumptions
• Testing for Significance

Slide 1

The Multiple Regression Model
• The Multiple Regression Model
  y = β0 + β1x1 + β2x2 + . . . + βpxp + ε
• The Multiple Regression Equation
  E(y) = β0 + β1x1 + β2x2 + . . . + βpxp
• The Estimated Multiple Regression Equation
  ŷ = b0 + b1x1 + b2x2 + . . . + bpxp

Slide 2

The Least Squares Method
• Least Squares Criterion
  min Σ(yi − ŷi)²
• Computation of Coefficients' Values
  The formulas for the regression coefficients b0, b1, b2, . . . , bp involve the use of matrix algebra. We will rely on computer software packages to perform the calculations.
• Interpretation of Coefficients
  bi represents an estimate of the change in y corresponding to a one-unit change in xi when all other independent variables are held constant.

Slide 3

The Multiple Coefficient of Determination
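As an aside, the matrix-algebra computation that the previous slide defers to software can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the textbook's procedure; the five data rows are taken from the programmer-salary example later in the chapter:

```python
import numpy as np

# Five (x1, x2, y) observations from the programmer-salary example:
# years of experience, aptitude test score, salary ($1000s)
X = np.array([[1.0,  4,  78],   # leading column of 1s estimates the intercept b0
              [1.0,  7, 100],
              [1.0,  1,  86],
              [1.0,  5,  82],
              [1.0,  8,  86]])
y = np.array([24.0, 43.0, 23.7, 34.3, 35.8])

# Solve min sum (y_i - yhat_i)^2; equivalent to b = (X'X)^-1 X'y
b, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = b
```

`np.linalg.lstsq` solves the least squares criterion through a numerically stable decomposition rather than forming (X'X)⁻¹ explicitly, which is why software packages are preferred over hand computation.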
• Relationship Among SST, SSR, SSE
  SST = SSR + SSE
  Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σ(yi − ŷi)²
• Multiple Coefficient of Determination
  R² = SSR/SST
• Adjusted Multiple Coefficient of Determination
  Ra² = 1 − (1 − R²)(n − 1)/(n − p − 1)
  where n is the number of observations and p is the number of independent variables.

Slide 4

Adjusted Multiple Coefficient of Determination
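Both coefficients of determination follow directly from the definitions on the previous slide. A minimal Python helper (the function name is illustrative, not from the text), assuming fitted values ŷ from some least squares fit:

```python
import numpy as np

def coefficients_of_determination(y, yhat, p):
    """Return (R^2, adjusted R^2) given observed y, fitted yhat,
    and p independent variables."""
    n = len(y)
    sst = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
    sse = np.sum((y - yhat) ** 2)         # error sum of squares
    ssr = sst - sse                       # using SST = SSR + SSE
    r2 = ssr / sst
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, r2_adj
```

For a perfect fit (ŷ = y) both values equal 1. Working the formula by hand: R² = .90 with n = 10, p = 2 adjusts to 1 − (.10)(9/7) ≈ .8714, while the larger R² = .91 with p = 3 adjusts to only 1 − (.09)(9/6) = .865, showing the penalty for the extra variable.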
• Why is adjusted R² preferred?
  R² always increases as independent variables are added to the model, even if the added variables are not statistically significant. The adjusted R² compensates for having too many independent variables in the model.
  Examples:
  • R² = .90, n = 10, p = 2:  Ra² = 1 − (1 − .90)(9/7) = .8714
  • R² = .91, n = 10, p = 3:  Ra² = 1 − (1 − .91)(9/6) = .8650

Slide 5

Model Assumptions
• Assumptions About the Error Term ε
  • The error ε is a random variable with mean of zero.
  • The variance of ε, denoted by σ², is the same for all values of the independent variables.
  • The values of ε are independent.
  • The error ε is a normally distributed random variable reflecting the deviation between the y value and the expected value of y given by β0 + β1x1 + β2x2 + . . . + βpxp.

Slide 6

Testing for Significance: F Test
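The error-term assumptions on the previous slide can be illustrated by simulation. The sketch below uses hypothetical parameter values (not from the chapter): it generates data that satisfies all four assumptions and fits the model by least squares. With an intercept in the model, the residuals sum to zero, mirroring the E(ε) = 0 assumption:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 200

x1 = rng.uniform(0, 10, n)            # e.g. years of experience
x2 = rng.uniform(60, 100, n)          # e.g. an aptitude score
eps = rng.normal(0.0, 2.0, n)         # zero mean, constant sigma, independent, normal

y = 5.0 + 1.5 * x1 + 0.3 * x2 + eps   # hypothetical true beta0, beta1, beta2

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b                     # residual mean is zero to machine precision
```

Re-running with a different seed changes b slightly, but the estimates stay close to the true (5.0, 1.5, 0.3) because the assumptions hold by construction.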
• Hypotheses
  H0: β1 = β2 = . . . = βp = 0
  Ha: At least one βi is not equal to zero
• Test Statistic
  F = MSR/MSE
• Rejection Rule
  Reject H0 if F > Fα, where Fα is based on an F distribution with p d.f. in the numerator and n − p − 1 d.f. in the denominator.
• Note: This test determines whether the multiple regression model as a whole is useful for explaining y, i.e., whether at least one x-variable in the model is useful for explaining y.

Slide 7

Testing for Significance: t Test
• Hypotheses
  H0: βi = 0
  Ha: βi ≠ 0
• Test Statistic
  t = bi / sbi
• Rejection Rule
  Reject H0 if t < −tα/2 or t > tα/2, where tα/2 is based on a t distribution with n − p − 1 degrees of freedom.
• Note: This test determines whether an individual x-variable (xi) in the regression model is useful for explaining y.

Slide 8

Example: Programmer Salary Survey
A software firm collected data for a sample of 20 computer programmers. A suggestion was made that regression analysis could be used to determine whether salary was related to years of experience and the score on the firm's programmer aptitude test. The years of experience, aptitude test score, and annual salary ($1000s) for the sample of 20 programmers are shown on the next slide.

Slide 9

Example: Programmer Salary Survey

  Exper.  Score  Salary      Exper.  Score  Salary
    4      78     24           9      88     38
    7     100     43           2      73     26.6
    1      86     23.7        10      75     36.2
    5      82     34.3         5      81     31.6
    8      86     35.8         6      74     29
   10      84     38           8      87     34
    0      75     22.2         4      79     30.1
    1      80     23.1         6      94     33.9
    6      83     30           3      70     28.2
    6      91     33           3      89     30

Slide 10

Example: Programmer Salary Survey
• Multiple Regression Model
  Suppose we believe that salary (y) is related to years of experience (x1) and the score on the programmer aptitude test (x2) by the following regression model:
  y = β0 + β1x1 + β2x2 + ε
  where
    y  = annual salary ($1000s)
    x1 = years of experience
    x2 = score on programmer aptitude test

Slide 11

Example: Programmer Salary Survey
• Multiple Regression Equation
  Using the assumption E(ε) = 0, we obtain
  E(y) = β0 + β1x1 + β2x2
• Estimated Regression Equation
  b0, b1, b2 are the least squares estimates of β0, β1, β2. Thus
  ŷ = b0 + b1x1 + b2x2

Slide 12

Example: Programmer Salary Survey
• Solving for the Estimates of β0, β1, β2
  The input data (the x1, x2, and y values for the 20 programmers) are fed into a computer package for solving multiple regression problems, which produces the least squares output: b0, b1, b2, R², etc.

Slide 13

Example: Programmer Salary Survey
• Minitab Computer Output
  The regression equation is
  Salary = 3.17 + 1.40 Exper + 0.251 Score

  Predictor     Coef      Stdev    t-ratio      p
  Constant     3.174      6.156      .52       .613
  Exper       1.4039      .1986     7.07       .000
  Score       .25089     .07735     3.24       .005

  s = 2.419   R-sq = 83.4%   R-sq(adj) = 81.5%

Slide 14

Example: Programmer Salary Survey
• Minitab Computer Output (continued)
  Analysis of Variance

  SOURCE        DF      SS        MS        F       P
  Regression     2    500.33    250.16    42.76   0.000
  Error         17     99.46      5.85
  Total         19    599.79

Slide 15

Example: Programmer Salary Survey
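The Minitab output above can be reproduced (up to rounding) with a NumPy sketch, using the 20 observations from the data slide:

```python
import numpy as np

# The 20 (Exper., Score, Salary) observations from the data slide
exper  = np.array([4, 7, 1, 5, 8, 10, 0, 1, 6, 6,
                   9, 2, 10, 5, 6, 8, 4, 6, 3, 3], dtype=float)
score  = np.array([78, 100, 86, 82, 86, 84, 75, 80, 83, 91,
                   88, 73, 75, 81, 74, 87, 79, 94, 70, 89], dtype=float)
salary = np.array([24, 43, 23.7, 34.3, 35.8, 38, 22.2, 23.1, 30, 33,
                   38, 26.6, 36.2, 31.6, 29, 34, 30.1, 33.9, 28.2, 30])

n, p = 20, 2
X = np.column_stack([np.ones(n), exper, score])
b, *_ = np.linalg.lstsq(X, salary, rcond=None)   # b0, b1, b2

yhat = X @ b
sst = np.sum((salary - salary.mean()) ** 2)      # 599.79 in the ANOVA table
sse = np.sum((salary - yhat) ** 2)               # 99.46
ssr = sst - sse                                  # 500.33
r2 = ssr / sst                                   # .834
f = (ssr / p) / (sse / (n - p - 1))              # 42.76
```

The coefficients come out as b0 ≈ 3.174, b1 ≈ 1.4039, b2 ≈ 0.25089, matching the Predictor table on the previous slide.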
• Questions:
  (a) What is the estimated multiple regression equation?
  (b) What is the estimated salary for a programmer with 6 years of experience and an aptitude test score of 85?
  (c) Does the estimated regression equation provide a good fit? Explain.

Slide 16

Example: Programmer Salary Survey
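Question (b) above can be answered by substituting into the estimated equation, using the full-precision coefficients from the Minitab output:

```python
b0, b1, b2 = 3.174, 1.4039, 0.25089   # Coef column of the Minitab output

# Estimated salary for 6 years of experience and an aptitude score of 85
yhat = b0 + b1 * 6 + b2 * 85
print(round(yhat, 3))   # 32.923, i.e. roughly $32,923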
(d) Is there a significant relationship among the variables? Use α = .05.
• F Test
  • Hypotheses
    H0: β1 = β2 = 0
    Ha: At least one βi is not equal to zero
  • Test Statistic
    F = MSR/MSE = 250.16/5.85 = 42.76
  • Conclusion
    F = 42.76 > F.05 = 3.59 (2 d.f. numerator, 17 d.f. denominator), so reject H0: the relationship is significant.

Slide 17

Example: Programmer Salary Survey
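For question (d), the F statistic follows directly from the ANOVA table; the critical value F.05 with (2, 17) degrees of freedom, 3.59, is taken from a standard F table rather than computed here:

```python
msr = 500.33 / 2        # MSR = SSR / p, from the ANOVA table
mse = 99.46 / 17        # MSE = SSE / (n - p - 1)
f = msr / mse           # 42.76, matching the ANOVA table

F_CRIT = 3.59           # F_.05 with (2, 17) d.f., from a standard F table
reject_h0 = f > F_CRIT  # True: the model as a whole is significant
```

The reported p-value of 0.000 tells the same story: at α = .05, H0 is rejected.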
(e) Is there a significant relationship between each of the independent variables and the dependent variable? Should any of the independent variables be dropped from the model? Use α = .05.
• t Test
  • Hypotheses
    H0: βi = 0
    Ha: βi ≠ 0
  • Test Statistics
    Exper: t = 1.4039/.1986 = 7.07    Score: t = .25089/.07735 = 3.24
  • Conclusions
    Both exceed t.025 = 2.110 (17 d.f.), so reject H0 for each βi: both variables are significant, and neither should be dropped.

Slide 18
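Similarly for question (e), each t statistic is the coefficient divided by its standard error (both from the Minitab Predictor table); the critical value t.025 with 17 degrees of freedom, 2.110, comes from a standard t table:

```python
t_exper = 1.4039 / 0.1986     # 7.07, matching the Minitab t-ratio
t_score = 0.25089 / 0.07735   # 3.24, matching the Minitab t-ratio

T_CRIT = 2.110   # t_.025 with 17 d.f., from a standard t table
both_significant = abs(t_exper) > T_CRIT and abs(t_score) > T_CRIT
# Both H0: beta_i = 0 are rejected, so neither variable should be dropped
```

The Minitab p-values (.000 and .005) confirm the same conclusion at α = .05.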