visualization. But you can’t easily use transformations (like splines) that return multiple columns. Including the transformations in the model function makes life a little easier when you’re working with many different datasets because the model is self-contained.

Time of Year: An Alternative Approach

In the previous section we used our domain knowledge (how the US school term affects travel) to improve the model. An alternative to making our knowledge explicit in the model is to give the data more room to speak. We could use a more flexible model and allow that to capture the pattern we’re interested in. A simple linear trend isn’t adequate, so we could try using a natural spline to fit a smooth curve across the year:

    library(splines)
    mod <- MASS::rlm(n ~ wday * ns(date, 5), data = daily)

    daily %>%
      data_grid(wday, date = seq_range(date, n = 13)) %>%
      add_predictions(mod) %>%
      ggplot(aes(date, pred, color = wday)) +
        geom_line() +
        geom_point()

394 | Chapter 19: Model Building
We see a strong pattern in the numbers of Saturday flights. This is reassuring, because we also saw that pattern in the raw data. It’s a good sign when you get the same signal from different approaches.

Exercises

1. Use your Google sleuthing skills to brainstorm why there were fewer than expected flights on January 20, May 26, and September 1. (Hint: they all have the same explanation.) How would these days generalize to another year?

2. What do the three days with high positive residuals represent? How would these days generalize to another year?

       daily %>%
         top_n(3, resid)
       #> # A tibble: 3 × 5
       #>         date     n  wday resid   term
       #>       <date> <int> <ord> <dbl> <fctr>
       #> 1 2013-11-30   857   Sat 112.4   fall
       #> 2 2013-12-01   987   Sun  95.5   fall
       #> 3 2013-12-28   814   Sat  69.4   fall

3. Create a new variable that splits the wday variable into terms, but only for Saturdays, i.e., it should have Thurs, Fri, but Sat-summer, Sat-spring, Sat-fall. How does this model compare with the model with every combination of wday and term?

4. Create a new wday variable that combines the day of week, term (for Saturdays), and public holidays. What do the residuals of that model look like?

5. What happens if you fit a day-of-week effect that varies by month (i.e., n ~ wday * month)? Why is this not very helpful?

6. What would you expect the model n ~ wday + ns(date, 5) to look like? Knowing what you know about the data, why would you expect it to be not particularly effective?

7. We hypothesized that people leaving on Sundays are more likely to be business travelers who need to be somewhere on Monday. Explore that hypothesis by seeing how it breaks down based on distance and time: if it’s true, you’d expect to see more Sunday evening flights to places that are far away.

What Affects the Number of Daily Flights? | 395

8. It’s a little frustrating that Sunday and Saturday are on separate ends of the plot. Write a small function to set the levels of the factor so that the week starts on Monday.
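
The remark at the start of this section, that transformations like splines return multiple columns, can be checked directly. The following is a minimal sketch and not part of the original text: the names `dates` and `basis` are hypothetical, the dates are generated here, not taken from the `daily` data, and they are converted to numbers explicitly as an assumption for this standalone call.

```r
library(splines)

# Hypothetical stand-in for the date variable: every day of 2013.
dates <- seq(as.Date("2013-01-01"), as.Date("2013-12-31"), by = "day")

# ns() builds a natural cubic spline basis; the second argument is the
# degrees of freedom. Dates are converted to numbers for this sketch.
basis <- ns(as.numeric(dates), df = 5)

# One row per day, one column per degree of freedom.
dim(basis)
```

Each column of the basis enters the model as its own term, which is the sense in which `ns(date, 5)` is a transformation that returns multiple columns.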
