Multiple Linear Regression
Data Mining
Prof. Dawn Woodard
School of ORIE
Cornell University

Outline
1. Announcements
2. Linear Regression

Announcements
Questions?
I will be handing out a practice exam on Friday. Please complete it on your own before next Friday 10/09, when we will have an in-class review.
The first exam will be in class Friday 10/16. It will include the material covered up through 10/14. It will be in-class and closed-book.
Homework 4 and hopefully 5 will be handed back before the exam.

Linear Regression
How is RSS (residual sum of squares) defined?
How is $R^2$ defined?

$$\mathrm{RSS} = \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2,$$
where the sum is over the whole data set (no train and test sets here!)
How do we obtain the estimates $\hat{\beta}_k$ of the coefficients $\beta_k$? They are chosen to minimize RSS.

Say we have 3 available predictors, $X_1$, $X_2$, and $X_3$. Compare
Model A: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \epsilon_i$
to
Model B: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \epsilon_i$
How does the RSS for Model A compare to that for Model B?

Say the RSS for Model A is 355, with $\hat{\beta}_0 = 3.2$, $\hat{\beta}_1 = 14.0$, and $\hat{\beta}_2 = 1.7$.
For Model B, are there values of $\beta_0$, $\beta_1$, $\beta_2$, and $\beta_3$ which give us a RSS of 355 or lower?
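
One way to explore the Model A versus Model B comparison numerically is the minimal sketch below. It is not from the lecture: the simulated data, the seed, the "true" coefficients, and the helper name rss are all assumptions made for illustration, and NumPy's least-squares solver stands in for whatever fitting software the course uses. The sketch fits both models by minimizing RSS, then evaluates Model B at Model A's estimates with the third coefficient set to 0.

```python
import numpy as np

# Simulated data; the seed and coefficients are assumptions for illustration only.
rng = np.random.default_rng(0)
N = 200
X = rng.normal(size=(N, 3))  # columns play the roles of X1, X2, X3
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(size=N)


def rss(design, coef, response):
    """Residual sum of squares: sum over i of (Y_i - Yhat_i)^2."""
    resid = response - design @ coef
    return float(resid @ resid)


# Design matrices: an intercept column plus the predictors each model uses.
ones = np.ones((N, 1))
design_a = np.hstack([ones, X[:, :2]])  # Model A: X1, X2
design_b = np.hstack([ones, X])         # Model B: X1, X2, X3

# Least-squares estimates, i.e. the coefficients that minimize RSS.
beta_a, *_ = np.linalg.lstsq(design_a, y, rcond=None)
beta_b, *_ = np.linalg.lstsq(design_b, y, rcond=None)

rss_a = rss(design_a, beta_a, y)
rss_b = rss(design_b, beta_b, y)

# Model B evaluated at Model A's estimates with beta_3 = 0.
beta_b_padded = np.append(beta_a, 0.0)
rss_b_padded = rss(design_b, beta_b_padded, y)

print("RSS, Model A:", rss_a)
print("RSS, Model B at (A's estimates, beta_3 = 0):", rss_b_padded)
print("RSS, Model B at its own least-squares fit:", rss_b)
```

With $\beta_3 = 0$, Model B makes exactly the same predictions as Model A, so it reproduces Model A's RSS. Because Model B's estimates minimize RSS over all four coefficients, its minimized RSS cannot exceed that value.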
