
set.seed(9)
train_size = nrow(Advertising)
train_index = sample(1:train_size, size = trunc(0.50 * train_size))
train_data = Advertising[train_index, ]
test_data = Advertising[-train_index, ]

We will look at two measures that assess how well a model is predicting: the train RMSE and the test RMSE.

$$\text{RMSE}_{\text{Train}} = \sqrt{\frac{1}{n_{\text{Tr}}} \sum_{i \in \text{Train}} \left(y_i - \hat{f}(x_i)\right)^2}$$

Here $n_{\text{Tr}}$ is the number of observations in the train set. Train RMSE will still always go down (or stay the same) as the complexity of a linear model increases. That means train RMSE will not be useful for comparing models, but checking that it decreases is a useful sanity check.

$$\text{RMSE}_{\text{Test}} = \sqrt{\frac{1}{n_{\text{Te}}} \sum_{i \in \text{Test}} \left(y_i - \hat{f}(x_i)\right)^2}$$

Here $n_{\text{Te}}$ is the number of observations in the test set. Test RMSE uses the model fit to the training data, but evaluated on the unused test data. This is a measure of how well the fitted model will predict in general, not simply how well it fits the data used to train the model, as is the case with train RMSE. What happens to test RMSE as the size of the model increases? That is what we will investigate.

fit_0 = lm(Sales ~ 1, data = train_data)
get_complexity(fit_0)
## [1] 0

# train RMSE
sqrt(mean((train_data$Sales - predict(fit_0, train_data)) ^ 2))
## [1] 4.788513

# test RMSE
sqrt(mean((test_data$Sales - predict(fit_0, test_data)) ^ 2))
## [1] 5.643574

The previous two operations obtain the train and test RMSE. Since we are about to use these operations repeatedly, we should use the rmse() function that we have already written.

# train RMSE
rmse(actual = train_data$Sales, predicted = predict(fit_0, train_data))
## [1] 4.788513

# test RMSE
rmse(actual = test_data$Sales, predicted = predict(fit_0, test_data))
## [1] 5.643574

This function can be improved for the inputs that we are using. We would like to obtain the train or test RMSE for a fitted model, given a train or test dataset and the name of the response variable.

get_rmse = function(model, data, response) {
  rmse(actual = data[, response], predicted = predict(model, data))
}
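The helpers rmse() and get_complexity() are used above but not defined in this section. The following is a minimal sketch of what they might look like, assuming rmse() is the usual root mean squared error and that get_complexity() counts the model's coefficients excluding the intercept. Both definitions are assumptions made for illustration, checked here on the built-in mtcars data, not on Advertising.

```r
# Assumed definition: root mean squared error of predictions.
rmse = function(actual, predicted) {
  sqrt(mean((actual - predicted) ^ 2))
}

# Assumed definition: number of estimated coefficients, not counting the intercept.
get_complexity = function(model) {
  length(coef(model)) - 1
}

# Quick check on a built-in dataset (mtcars ships with base R).
fit_null = lm(mpg ~ 1, data = mtcars)        # intercept only
fit_two  = lm(mpg ~ wt + hp, data = mtcars)  # two predictors

get_complexity(fit_null)
get_complexity(fit_two)
rmse(actual = mtcars$mpg, predicted = predict(fit_two, mtcars))
```

Under this assumed definition, an intercept-only model such as fit_0 has complexity 0, which agrees with the output shown above.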
By using get_rmse(), our code becomes easier to read, and it is more obvious what task we are accomplishing.

get_rmse(model = fit_0, data = train_data, response = "Sales") # train RMSE
## [1] 4.788513
get_rmse(model = fit_0, data = test_data, response = "Sales") # test RMSE
## [1] 5.643574

5.4 Adding Flexibility to Linear Models

Each successive model we fit will be more flexible than the last, using both interactions and polynomial terms. We will see the training error decrease each time the model is made more flexible. We expect the test error to decrease a number of times, then eventually start going up as a result of overfitting.

fit_1 = lm(Sales ~ ., data = train_data)
get_complexity(fit_1)
## [1] 3
get_rmse(model = fit_1, data = train_data, response = "Sales") # train RMSE
## [1] 1.637699
get_rmse(model = fit_1, data = test_data, response = "Sales") # test RMSE
## [1] 1.737574

fit_2 = lm(Sales ~ Radio * Newspaper * TV, data = train_data)
get_complexity(fit_2)
## [1] 7
get_rmse(model = fit_2, data = train_data, response = "Sales") # train RMSE
## [1] 0.7797226
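The pattern of fitting increasingly flexible models and comparing train and test RMSE can also be tried on simulated data. The sketch below does not use Advertising. It assumes a made-up quadratic relationship with noise, and uses polynomial degree as a stand-in for flexibility. All names in it (sim_data, sim_train, sim_test, results) are hypothetical, and the rmse() definition is the assumed one, not necessarily the version written earlier in the chapter.

```r
set.seed(42)

# Simulated data: the true relationship is quadratic in x (an assumption for this demo).
n = 200
x = runif(n, min = 0, max = 4)
y = 2 + 3 * x - 0.8 * x ^ 2 + rnorm(n, sd = 1)
sim_data = data.frame(x = x, y = y)

# 50/50 train-test split, mirroring the split used in the text.
sim_index = sample(1:n, size = trunc(0.50 * n))
sim_train = sim_data[sim_index, ]
sim_test  = sim_data[-sim_index, ]

# Assumed definition of RMSE.
rmse = function(actual, predicted) {
  sqrt(mean((actual - predicted) ^ 2))
}

# Fit polynomials of degree 1 through 8 to the training data only,
# then record RMSE on both the train and the test set.
degrees = 1:8
results = data.frame(degree = degrees, train_rmse = NA_real_, test_rmse = NA_real_)
for (d in degrees) {
  fit = lm(y ~ poly(x, degree = d), data = sim_train)
  results$train_rmse[d] = rmse(sim_train$y, predict(fit, sim_train))
  results$test_rmse[d]  = rmse(sim_test$y, predict(fit, sim_test))
}

results
```

Because each polynomial model contains the previous one, the train_rmse column cannot increase from one row to the next. The test_rmse column carries no such guarantee, which is why it is the more useful of the two for comparing models.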