An Introduction to Statistical Learning with Applications in R - Gareth James, Daniela Witten, Trevo

> boot.fn=function(data,index)
+  coefficients(lm(mpg~horsepower+I(horsepower^2),data=data,subset=index))
> set.seed(1)
> boot(Auto,boot.fn,1000)

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = Auto, statistic = boot.fn, R = 1000)

Bootstrap Statistics :
     original       bias    std. error
t1*    56.900    6.098e-03      2.0945
t2*    -0.466   -1.777e-04      0.0334
t3*     0.001    1.324e-06      0.0001

> summary(lm(mpg~horsepower+I(horsepower^2),data=Auto))$coef
                Estimate Std. Error t value Pr(>|t|)
(Intercept)      56.9001    1.80043      32 1.7e-109
horsepower       -0.4662    0.03112     -15  2.3e-40
I(horsepower^2)   0.0012    0.00012      10  2.2e-21

5.4 Exercises

Conceptual

1. Using basic statistical properties of the variance, as well as single-variable calculus, derive (5.6). In other words, prove that α given by (5.6) does indeed minimize Var(αX + (1 − α)Y).

2. We will now derive the probability that a given observation is part of a bootstrap sample. Suppose that we obtain a bootstrap sample from a set of n observations.
(a) What is the probability that the first bootstrap observation is not the jth observation from the original sample? Justify your answer.
(b) What is the probability that the second bootstrap observation is not the jth observation from the original sample?
(c) Argue that the probability that the jth observation is not in the bootstrap sample is (1 − 1/n)^n.
(d) When n = 5, what is the probability that the jth observation is in the bootstrap sample?
(e) When n = 100, what is the probability that the jth observation is in the bootstrap sample?
(f) When n = 10,000, what is the probability that the jth observation is in the bootstrap sample?
(g) Create a plot that displays, for each integer value of n from 1 to 100,000, the probability that the jth observation is in the bootstrap sample. Comment on what you observe.
(h) We will now investigate numerically the probability that a bootstrap sample of size n = 100 contains the jth observation. Here j = 4. We repeatedly create bootstrap samples, and each time we record whether or not the fourth observation is contained in the bootstrap sample.

> store=rep(NA, 10000)
> for(i in 1:10000){
    store[i]=sum(sample(1:100, rep=TRUE)==4)>0
  }
> mean(store)

Comment on the results obtained.

3. We now review k-fold cross-validation.
(a) Explain how k-fold cross-validation is implemented.
(b) What are the advantages and disadvantages of k-fold cross-validation relative to:
i. The validation set approach?
ii. LOOCV?

4. Suppose that we use some statistical learning method to make a prediction for the response Y for a particular value of the predictor X. Carefully describe how we might estimate the standard deviation of our prediction.

Applied

5. In Chapter 4, we used logistic regression to predict the probability of default using income and balance on the Default data set. We will now estimate the test error of this logistic regression model using the validation set approach. Do not forget to set a random seed before beginning your analysis.
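
As a rough illustration of the validation set approach named in Exercise 5, the sketch below fits a logistic regression on one half of a data set and measures the misclassification rate on the other half. It uses simulated data rather than the Default data set. The data frame sim, its columns y, x1 and x2, the 50/50 split, and the 0.5 classification threshold are all assumptions made for this example and are not specified by the exercise.

```r
# Validation set approach for a logistic regression, on simulated data.
# `sim` and its columns are hypothetical stand-ins, not the Default data.
set.seed(1)                                   # makes the random split reproducible
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-1 + 2 * x1 - x2))  # simulated binary response
sim <- data.frame(y = y, x1 = x1, x2 = x2)

# Randomly choose half of the row indices for training (assumed split)
train <- sample(n, n / 2)

# Fit the model using only the training observations
fit <- glm(y ~ x1 + x2, data = sim, family = binomial, subset = train)

# Predicted probabilities for the held-out half, thresholded at 0.5 (assumed)
prob <- predict(fit, newdata = sim[-train, ], type = "response")
pred <- as.integer(prob > 0.5)

# Validation set error: fraction of held-out observations misclassified
val_error <- mean(pred != sim$y[-train])
val_error
```

Changing the argument to set.seed() produces a different split and, in general, a different error estimate, which is why the exercise asks for a seed to be set first.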