

MS&E 226 “Small” Data
In-Class Midterm Examination Solutions
October 20, 2015

PROBLEM 1. Alice uses ordinary least squares to fit a linear regression model on a dataset containing outcome data Y and covariates X (assume all covariates are numeric). She shares her results with Bob. Bob wants to replicate the results, and also uses ordinary least squares to fit a linear regression model, but does so after standardizing each column of data (the outcome as well as all covariates). When they compare the sum of squared residuals, they notice that the two values are wildly different. This catches Alice and Bob by surprise, because they were taught that standardizing doesn’t change anything for linear regression. Why was the sum of squared residuals so different in their respective fitted models?

(a) Because the intercept is not scaled.
(b) Because the outcome is measured on a different scale.
(c) Because they should have compared the square root of the sum of squared residuals, instead of just the sum of squared residuals.
(d) One of them must have made a coding mistake, because the sum of squared residuals should have been the same.

Solution: (b) When outcomes are not measured in the same units, we cannot compare the sum of squared residuals directly. Standardizing rescales Y, so the residuals, and therefore their sum of squares, are expressed in different units in the two fits.

PROBLEM 2. Suppose we have data with covariates X and outcome Y, and we build a linear regression model of Y against the covariates X. Let A be the resulting R² value. Now suppose we add new covariates to X. However, assume these covariates are just random noise (e.g., they might be i.i.d. N(0, 1) random variables), without any relationship to X or Y. We now build another linear regression model using all the original and new covariates, and compute the resulting R² value; let this be B. What can you say about how A and B are related to each other?

(a) A ≥ B.
(b) A = B.
(c) A ≤ B.
(d) Can’t say.

Solution: (c) R² never decreases when we add new covariates. OLS could always set the coefficients on the new covariates to zero and recover the original fit, so the sum of squared residuals cannot go up.

PROBLEM 3. You are given data with covariates X and outcome Y, and fit three different models: one by ordinary least squares (OLS), one by ridge regression with λ > 0, and one by lasso with λ > 0. How does the sum of squared residuals compare across these methods?
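
The solutions to Problems 1 and 2 can be checked numerically. The following is a minimal sketch, not part of the original exam: the simulated data, the seed, and the helper names `ols_fit` and `standardize` are all assumptions made for illustration. It fits OLS on raw and standardized data (Problem 1) and then adds pure-noise covariates (Problem 2).

```python
import numpy as np


def ols_fit(X, y):
    """Fit OLS with an intercept; return (sum of squared residuals, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ beta
    ssr = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    return ssr, 1.0 - ssr / sst


def standardize(a):
    """Center each column and scale it to unit (population) standard deviation."""
    return (a - a.mean(axis=0)) / a.std(axis=0)


# Hypothetical simulated data: three covariates on very different scales.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3)) * np.array([1.0, 10.0, 100.0])
y = 50.0 + X @ np.array([2.0, -0.5, 0.03]) + rng.normal(scale=5.0, size=n)

# Problem 1: same model, raw vs. standardized columns.
ssr_raw, r2_raw = ols_fit(X, y)
ssr_std, r2_std = ols_fit(standardize(X), standardize(y))
print("SSR raw:", ssr_raw, " SSR standardized:", ssr_std)
print("R^2 raw:", r2_raw, " R^2 standardized:", r2_std)

# Problem 2: append five i.i.d. N(0, 1) columns unrelated to X or y.
noise = rng.normal(size=(n, 5))
_, r2_A = ols_fit(X, y)
_, r2_B = ols_fit(np.column_stack([X, noise]), y)
print("A (original R^2):", r2_A, " B (with noise covariates):", r2_B)
```

In this sketch the two sums of squared residuals differ by a factor equal to the variance of the raw outcome, while R² is the same for both fits because it is unit-free. The R² with the noise covariates included is at least as large as the original one.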