# hw11_pengyu2.html - STAT 420 Homework 11 Pengyu Chen NetID...


STAT 420: Homework 11

Pengyu Chen, NetID: pengyu2

- Assignment
  - Exercise 1 (prostate Data)
  - Exercise 2 (Goalies Data, revisited)
  - Exercise 3 (Body Dimensions)

## Assignment

### Exercise 1 (prostate Data)

Using the prostate dataset from the faraway package, fit a model with lpsa as the response and the other variables as predictors. For this exercise only consider first order predictors.

    library(faraway)
    lpsa_full_mod = lm(lpsa ~ ., data = prostate)
    lpsa_null_mod = lm(lpsa ~ 1, data = prostate)

    # Exhaustive search: best model of each size (1 to 8 predictors)
    library(leaps)
    all_lpsa_mod = summary(regsubsets(lpsa ~ ., data = prostate))

    p = length(coef(lpsa_full_mod))   # number of coefficients in the full model
    n = length(resid(lpsa_full_mod))  # number of observations

(a) Find the model with the best AIC. Report the predictors that are used in the resulting model.

    # A model with k predictors has k + 1 coefficients, hence the penalty term 2:p
    lpsa_mod_aic = n * log(all_lpsa_mod$rss / n) + 2 * (2:p)
    best_aic_ind = which.min(lpsa_mod_aic)
    all_lpsa_mod$which[best_aic_ind, ]

    ## (Intercept)      lcavol     lweight         age        lbph         svi 
    ##        TRUE        TRUE        TRUE        TRUE        TRUE        TRUE 
    ##         lcp     gleason       pgg45 
    ##       FALSE       FALSE       FALSE 

    (lpsa_best_aic = lm(lpsa ~ . - lcp - gleason - pgg45, data = prostate))

    ## 
    ## Call:
    ## lm(formula = lpsa ~ . - lcp - gleason - pgg45, data = prostate)
    ## 
    ## Coefficients:
    ## (Intercept)       lcavol      lweight          age         lbph          svi  
    ##      0.9510       0.5656       0.4237      -0.0149       0.1118       0.7210  

lcavol, lweight, svi, lbph, and age are used in the resulting model.

(b) Find the model with the best BIC. Report the predictors that are used in the resulting model.

    # Same RSS-based criterion, with the BIC penalty log(n) per coefficient
    lpsa_mod_bic = n * log(all_lpsa_mod$rss / n) + log(n) * (2:p)
    best_bic_ind = which.min(lpsa_mod_bic)
    all_lpsa_mod$which[best_bic_ind, ]

    ## (Intercept)      lcavol     lweight         age        lbph         svi 

    ##        TRUE        TRUE        TRUE       FALSE       FALSE        TRUE 
    ##         lcp     gleason       pgg45 
    ##       FALSE       FALSE       FALSE 

    (lpsa_best_bic = lm(lpsa ~ lcavol + lweight + svi, data = prostate))

    ## 
    ## Call:
    ## lm(formula = lpsa ~ lcavol + lweight + svi, data = prostate)
    ## 
    ## Coefficients:
    ## (Intercept)       lcavol      lweight          svi  
    ##      -0.268        0.552        0.509        0.666  

lcavol, lweight, and svi are used in the resulting model.

(c) Find the model with the best Adjusted \(R^2\). Report the predictors that are used in the resulting model.

    best_r2_ind = which.max(all_lpsa_mod$adjr2)
    all_lpsa_mod$which[best_r2_ind, ]

    ## (Intercept)      lcavol     lweight         age        lbph         svi 
    ##        TRUE        TRUE        TRUE        TRUE        TRUE        TRUE 
    ##         lcp     gleason       pgg45 
    ##        TRUE       FALSE        TRUE 

    (lpsa_best_r2 = lm(lpsa ~ . - gleason, data = prostate))

    ## 
    ## Call:
    ## lm(formula = lpsa ~ . - gleason, data = prostate)
    ## 
    ## Coefficients:
    ## (Intercept)       lcavol      lweight          age         lbph          svi  
    ##     0.95393      0.59161      0.44829     -0.01934      0.10767      0.75773  
    ##         lcp        pgg45  
    ##    -0.10448      0.00532  

All the predictors except for gleason are used in the resulting model.

(d) Of the four models you just considered, some of which may be the same, which is the best for making predictions? Use leave-one-out-cross-validated MSE or RMSE to decide.

    # LOOCV RMSE from a single fit: each leave-one-out residual is e_i / (1 - h_i)
    calc_loocv_rmse = function(model) {
      sqrt(mean((resid(model) / (1 - hatvalues(model))) ^ 2))
    }

    calc_loocv_rmse(lpsa_best_aic)

    ## [1] 0.7369

    calc_loocv_rmse(lpsa_best_bic)

    ## [1] 0.7381

    calc_loocv_rmse(lpsa_best_r2)

    ## [1] 0.7411

The model selected by AIC has the lowest LOOCV RMSE (0.7369, versus 0.7381 for the BIC model and 0.7411 for the Adjusted \(R^2\) model), so it is preferred for making predictions.
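The LOOCV shortcut in part (d) depends on a property of least-squares fits: the residual for observation i from a model refit without that observation equals e_i / (1 - h_i), where h_i is its leverage. The following is a minimal sketch that checks this numerically by brute-force refitting. It uses the built-in `mtcars` data and a hypothetical helper `loocv_rmse_refit()` as stand-ins; neither is part of the homework.

```r
# Assumption: mtcars stands in for the prostate data so the sketch
# runs without extra packages.
fit = lm(mpg ~ wt + hp, data = mtcars)

# Shortcut used in part (d): one fit, residuals rescaled by leverage.
calc_loocv_rmse = function(model) {
  sqrt(mean((resid(model) / (1 - hatvalues(model))) ^ 2))
}

# Hypothetical helper: refit the model n times, each time holding out
# one row and predicting it from the remaining rows.
loocv_rmse_refit = function(formula, data) {
  n = nrow(data)
  response = all.vars(formula)[1]  # name of the response variable
  errs = numeric(n)
  for (i in seq_len(n)) {
    fit_i = lm(formula, data = data[-i, ])
    pred_i = predict(fit_i, newdata = data[i, , drop = FALSE])
    errs[i] = data[[response]][i] - pred_i
  }
  sqrt(mean(errs ^ 2))
}

shortcut = calc_loocv_rmse(fit)
brute = loocv_rmse_refit(mpg ~ wt + hp, mtcars)

# The two values should agree up to floating-point error.
all.equal(shortcut, brute)
```

The shortcut needs only one fit, which is why it is convenient when comparing several candidate models.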
### Exercise 2 (Goalies Data, revisited)

(a) Use the data found in goalies_cleaned.csv to find a “good” model for wins, W. Use any methods seen in class. The model should reach a Multiple R-squared above 0.90 using fewer than 35 parameters. Hint: you may want to look into the ability to add many interactions quickly in R.
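One possible reading of the hint is R's formula operator `^`, which expands a group of predictors into their main effects plus all interactions up to the given order. The sketch below illustrates the syntax on the built-in `mtcars` data, which is an assumption for illustration only. It is not the goalies data and not the author's solution.

```r
# Assumption: mtcars is a stand-in dataset; the variable names below
# do not come from goalies_cleaned.csv.

# (wt + hp + disp)^2 expands to the three main effects plus the
# three two-way interactions wt:hp, wt:disp, and hp:disp.
int_mod = lm(mpg ~ (wt + hp + disp)^2, data = mtcars)

# Inspect which terms were created and how many parameters are used.
names(coef(int_mod))
length(coef(int_mod))

# Multiple R-squared of the fitted model.
summary(int_mod)$r.squared
```

With k predictors, `^2` adds k(k - 1)/2 interaction terms, so the parameter count grows quickly. A search such as `step()` or `regsubsets()` can then be used to trim the model back under a parameter limit.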
