hw11_pengyu2.html - STAT 420 Homework 11 Pengyu Chen NetID...

STAT 420: Homework 11

Pengyu Chen, NetID: pengyu2

Assignment
  Exercise 1 (prostate Data)
  Exercise 2 (Goalies Data, revisited)
  Exercise 3 (Body Dimensions)

Assignment

Exercise 1 (prostate Data)

Using the prostate dataset from the faraway package, fit a model with lpsa as the response and the other variables as predictors. For this exercise only consider first order predictors.

    library(faraway)
    lpsa_full_mod = lm(lpsa ~ ., data = prostate)
    lpsa_null_mod = lm(lpsa ~ 1, data = prostate)

    library(leaps)
    # best subset of each size, from 1 predictor up to all 8
    all_lpsa_mod = summary(regsubsets(lpsa ~ ., data = prostate))

    p = length(coef(lpsa_full_mod))   # parameters in the full model
    n = length(resid(lpsa_full_mod))  # number of observations

(a) Find the model with the best AIC. Report the predictors that are used in the resulting model.

    # the candidate models have 2 to p parameters (intercept included)
    lpsa_mod_aic = n * log(all_lpsa_mod$rss / n) + 2 * (2:p)
    best_aic_ind = which.min(lpsa_mod_aic)
    all_lpsa_mod$which[best_aic_ind, ]

    ## (Intercept)      lcavol     lweight         age        lbph         svi
    ##        TRUE        TRUE        TRUE        TRUE        TRUE        TRUE
    ##         lcp     gleason       pgg45
    ##       FALSE       FALSE       FALSE

    (lpsa_best_aic = lm(lpsa ~ . - lcp - gleason - pgg45, data = prostate))

    ##
    ## Call:
    ## lm(formula = lpsa ~ . - lcp - gleason - pgg45, data = prostate)
    ##
    ## Coefficients:
    ## (Intercept)       lcavol      lweight          age         lbph          svi
    ##      0.9510       0.5656       0.4237      -0.0149       0.1118       0.7210

lcavol, lweight, age, lbph, and svi are used in the resulting model.

(b) Find the model with the best BIC. Report the predictors that are used in the resulting model.

    # same fit term as AIC, with a log(n) penalty per parameter
    lpsa_mod_bic = n * log(all_lpsa_mod$rss / n) + log(n) * (2:p)
    best_bic_ind = which.min(lpsa_mod_bic)
    all_lpsa_mod$which[best_bic_ind, ]

    ## (Intercept)      lcavol     lweight         age        lbph         svi
    ##        TRUE        TRUE        TRUE       FALSE       FALSE        TRUE
    ##         lcp     gleason       pgg45
    ##       FALSE       FALSE       FALSE

    (lpsa_best_bic = lm(lpsa ~ lcavol + lweight + svi, data = prostate))

    ##
    ## Call:
    ## lm(formula = lpsa ~ lcavol + lweight + svi, data = prostate)
    ##
    ## Coefficients:
    ## (Intercept)       lcavol      lweight          svi
    ##      -0.268        0.552        0.509        0.666

lcavol, lweight, and svi are used in the resulting model.

(c) Find the model with the best Adjusted \(R^2\). Report the predictors that are used in the resulting model.

    best_r2_ind = which.max(all_lpsa_mod$adjr2)
    all_lpsa_mod$which[best_r2_ind, ]

    ## (Intercept)      lcavol     lweight         age        lbph         svi
    ##        TRUE        TRUE        TRUE        TRUE        TRUE        TRUE
    ##         lcp     gleason       pgg45
    ##        TRUE       FALSE        TRUE

    (lpsa_best_r2 = lm(lpsa ~ . - gleason, data = prostate))

    ##
    ## Call:
    ## lm(formula = lpsa ~ . - gleason, data = prostate)
    ##
    ## Coefficients:
    ## (Intercept)       lcavol      lweight          age         lbph          svi
    ##     0.95393      0.59161      0.44829     -0.01934      0.10767      0.75773
    ##         lcp        pgg45
    ##    -0.10448      0.00532

All the predictors except for gleason are used in the resulting model.

(d) Of the four models you just considered, some of which may be the same, which is the best for making predictions? Use leave-one-out-cross-validated MSE or RMSE to decide.

    # LOOCV RMSE from a single fit, using the leverages (hat values)
    calc_loocv_rmse = function(model) {
      sqrt(mean((resid(model) / (1 - hatvalues(model))) ^ 2))
    }

    calc_loocv_rmse(lpsa_best_aic)

    ## [1] 0.7369

    calc_loocv_rmse(lpsa_best_bic)

    ## [1] 0.7381

    calc_loocv_rmse(lpsa_best_r2)

    ## [1] 0.7411

The model selected by AIC has the lowest LOOCV RMSE (0.7369, versus 0.7381 for the BIC model and 0.7411 for the Adjusted \(R^2\) model), so it is preferred for making predictions.
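The calc_loocv_rmse function in part (d) relies on a least-squares identity: the leave-one-out residual for observation i equals e_i / (1 - h_ii), where h_ii is its leverage, so no refitting is needed. As a minimal sketch, the check below compares that shortcut with an explicit refit loop. The helper calc_loocv_rmse_refit is hypothetical (it is not part of the original homework), and the built-in mtcars data stands in for prostate so that the example runs without extra packages.

```r
# Shortcut from part (d): LOOCV RMSE from one fit via the leverages.
calc_loocv_rmse = function(model) {
  sqrt(mean((resid(model) / (1 - hatvalues(model))) ^ 2))
}

# Hypothetical helper: the same quantity, computed by refitting n times.
calc_loocv_rmse_refit = function(formula, data) {
  n = nrow(data)
  errors = numeric(n)
  response = all.vars(formula)[1]  # name of the response variable
  for (i in seq_len(n)) {
    # fit without observation i, then predict observation i
    fit = lm(formula, data = data[-i, ])
    pred = predict(fit, newdata = data[i, , drop = FALSE])
    errors[i] = data[[response]][i] - pred
  }
  sqrt(mean(errors ^ 2))
}

# Assumption: mtcars replaces prostate purely for illustration.
demo_formula = mpg ~ wt + hp
shortcut = calc_loocv_rmse(lm(demo_formula, data = mtcars))
refit = calc_loocv_rmse_refit(demo_formula, mtcars)

# The two values should agree up to floating-point error.
all.equal(shortcut, refit)
```

The shortcut needs one fit instead of n, which is why it is convenient when several candidate models are compared.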
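The AIC and BIC values in parts (a) and (b) are computed by hand as n * log(RSS / n) plus a penalty of 2 or log(n) per parameter. A small sketch, assuming the built-in mtcars data in place of prostate, shows that this matches what extractAIC() from the stats package reports for a fitted lm. The helper calc_ic is a hypothetical name introduced here for illustration.

```r
# Hypothetical helper mirroring the formula used in parts (a) and (b):
# n * log(RSS / n) + k * (number of parameters)
calc_ic = function(model, k = 2) {
  n = length(resid(model))
  p = length(coef(model))
  rss = sum(resid(model) ^ 2)
  n * log(rss / n) + k * p
}

# Assumption: mtcars replaces prostate purely for illustration.
fit = lm(mpg ~ wt + hp, data = mtcars)
n = nrow(mtcars)

aic_by_hand = calc_ic(fit, k = 2)       # AIC penalty
bic_by_hand = calc_ic(fit, k = log(n))  # BIC penalty

# extractAIC() returns c(equivalent df, criterion); keep the criterion.
aic_builtin = extractAIC(fit, k = 2)[2]
bic_builtin = extractAIC(fit, k = log(n))[2]

all.equal(aic_by_hand, aic_builtin)
all.equal(bic_by_hand, bic_builtin)
```

Note that AIC() and BIC() in R include additional constants from the full log-likelihood, so their values differ from these, but the ranking of models fit to the same data is unchanged.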
Exercise 2 (Goalies Data, revisited)

(a) Use the data found in goalies_cleaned.csv to find a “good” model for wins, W. Use any methods seen in class. The model should reach a Multiple R-squared above 0.90 using fewer than 35 parameters.

Hint: you may want to look into the ability to add many interactions quickly in R.
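One way to read the hint is R's formula shorthand: wrapping a sum of predictors in ( ... ) ^ 2 expands to all main effects plus every two-way interaction. The sketch below only illustrates that syntax. It uses the built-in mtcars data because the columns of goalies_cleaned.csv are not shown here, and it is not the author's solution to this exercise.

```r
# Main effects only: intercept + 3 slopes.
additive_mod = lm(mpg ~ wt + hp + disp, data = mtcars)

# ( ... ) ^ 2 adds all two-way interactions among the listed predictors:
# wt:hp, wt:disp, and hp:disp.
twoway_mod = lm(mpg ~ (wt + hp + disp) ^ 2, data = mtcars)

# Count parameters to check against a budget such as "fewer than 35".
length(coef(additive_mod))
length(coef(twoway_mod))
names(coef(twoway_mod))

# Multiple R-squared never decreases when terms are added to a nested model.
summary(additive_mod)$r.squared
summary(twoway_mod)$r.squared
```

Because the parameter count grows quickly with the number of predictors, a large interaction model like this is usually a starting point for a selection procedure rather than a final model.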