BUAD 310
12/02/08

Multiple Regression

Multiple regression: y depends on k explanatory variables (e.g., three variables) X1, X2, X3, ..., Xk. The mean response is a function of these variables, and the observed values of y vary about their means. We can think of subpopulations of responses, each corresponding to a particular set of values of the explanatory variables x1, x2, x3, ..., xk. In each subpopulation, y varies normally with a mean given by the population regression equation (*). The SD is the same in all subpopulations.

(*)  μ_y = β0 + β1 x1 + β2 x2 + ... + βk xk

Regression model

The model is

  y = μ_{y|x1, x2, ..., xk} + ε = β0 + β1 x1 + β2 x2 + ... + βk xk + ε

Assumptions for multiple regression are stated about the model error terms, the ε's.

Error term assumptions

1. Mean of Zero Assumption: the mean of the error terms is equal to 0.
2. Constant Variance Assumption: the variance of the error terms, σ², is the same for every combination of values of x1, x2, ..., xk.
3. Normality Assumption: the error terms follow a normal distribution for every combination of values of x1, x2, ..., xk.
4. Independence Assumption: the values of the error terms are statistically independent of each other.

Estimating the coefficients

Estimate the model coefficients β0, β1, β2, ..., βk from the data by b0, b1, b2, ..., bk. The method of least squares chooses the values of the b_i's that make the sum of squares of the residuals as small as possible, i.e., that minimize

  Σ_i (y_i − b0 − b1 x_i1 − b2 x_i2 − ... − bk x_ik)²

The formula for the b_i's is complicated; software will produce the estimates for you.

Residuals and Sum of Squared Errors

The ith residual is

  e_i = observed response − predicted response = y_i − ŷ_i = y_i − (b0 + b1 x_i1 + ... + bk x_ik)

  SSE = Σ e_i² = Σ (y_i − ŷ_i)²

Mean Square Error

This is the point estimate of the residual variance σ²:

  s² = MSE = SSE / (n − k − 1)
  s = √MSE = √(SSE / (n − k − 1))

Note: when k = 1, we divide by n − 2.

Interpretation of MLR Coefficients
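The estimation steps above can be sketched numerically. This is a minimal illustration, not part of the original notes: the data below are made up, and NumPy's least-squares routine stands in for the "software" the notes refer to. It computes b0, b1, ..., bk, the residuals, SSE, MSE, and s exactly as defined above.

```python
import numpy as np

# Made-up data for illustration: n = 6 observations, k = 2 explanatory variables
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0],
              [6.0, 5.0]])
y = np.array([3.1, 2.9, 7.2, 6.8, 11.1, 10.9])

n, k = X.shape

# Prepend a column of 1s so the first fitted coefficient is the intercept b0
X1 = np.column_stack([np.ones(n), X])

# Least squares: choose b0, b1, ..., bk to minimize the sum of squared residuals
b, *_ = np.linalg.lstsq(X1, y, rcond=None)

y_hat = X1 @ b            # predicted responses
e = y - y_hat             # residuals e_i = y_i - y_hat_i
SSE = np.sum(e ** 2)      # sum of squared errors
MSE = SSE / (n - k - 1)   # s^2, the point estimate of sigma^2
s = np.sqrt(MSE)          # residual standard deviation
```

Because the fit includes an intercept, the residuals sum to (numerically) zero, a useful sanity check on any least-squares output.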
1. Slope (b_i): the estimated change in Y for each 1-unit increase in X_i, accounting for the other variables in the model.
2. Y-intercept (b0): the value of Y when all X_i = 0.*
   *Note: be careful with this statement; it often does not make practical sense.

Multiple coefficient of determination

  R² = Explained Variation / Total Variation = SSR / SST

The proportion of variation in Y explained by all X variables taken together. R² increases whenever a new explanatory variable is added to the model, which is a disadvantage when comparing models of different size.

Adjusted multiple coefficient of determination

  Adj R² = 1 − [(n − 1) / (n − k − 1)] (1 − R²)

The proportion of variation in Y explained by all X variables taken together, adjusted for the model size k (and sample size n). It is smaller than R² and is used to compare models of different size. (You don't need to know the formula for the final exam.)

ANOVA Table

Source      DF   SS   MS   F   P
Regression ...
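The two goodness-of-fit measures above can be computed directly from a model's predictions. A minimal sketch with made-up responses and fitted values (any hypothetical fitted model with an intercept would do); it uses the identity SST = SSR + SSE, which holds for least-squares fits that include an intercept:

```python
import numpy as np

# Made-up data for illustration: responses and predictions from a
# hypothetical model with k = 2 explanatory variables
y = np.array([3.0, 5.0, 7.0, 6.0, 9.0, 10.0])
y_hat = np.array([3.2, 4.8, 6.9, 6.3, 8.8, 10.0])
n, k = len(y), 2

SSE = np.sum((y - y_hat) ** 2)      # unexplained variation
SST = np.sum((y - y.mean()) ** 2)   # total variation
SSR = SST - SSE                     # explained variation (intercept model)

R2 = SSR / SST                                    # R^2 = SSR / SST
adj_R2 = 1 - (n - 1) / (n - k - 1) * (1 - R2)     # adjusted R^2
```

Note that adj_R2 is never larger than R2 here, since (n − 1)/(n − k − 1) ≥ 1; that is exactly why it penalizes larger models when comparing fits.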
Fall '07, Lv