3. ** R is necessary for the remaining questions.

The movie Moneyball is about how proper use of statistics in baseball (called "sabermetrics") can bring unexpected success to a low-ranked, low-budget team. In it, the general manager of the Oakland A’s believes that (then) unpopular statistics, like a player’s ability to get on base, can predict the team’s ability to score runs better than traditional statistics, such as home run counts and batting averages. By recruiting players who scored high in these underused statistics, he was able to improve the team’s record without spending exorbitant amounts of money on more mainstream players.

We will examine the data from the 30 MLB teams during the 2009 season. We will search for linear
relationships between potential explanatory variables and the response variable: the number of runs scored in a season, which we treat as a measure of "success" for this data analysis. You don’t need to know the rules of baseball to understand this question, but if you would like a refresher you can check out Wikipedia: https://en.wikipedia.org/wiki/Baseball_rules#Gameplay

In addition to runs scored, there are seven traditionally used variables in the data set: at-bats, hits,
home runs, batting average, strikeouts, walks, and stolen bases. The last three variables in the data set are "nontraditional": on-base percentage, slugging percentage, and on-base plus slugging.

(a) Import the 2009 MLB dataset into RStudio using read.csv() or read.table(). The dataset
can be found on Canvas in the file "mlb09.csv". Make sure the data file is placed in your current working directory.

(b) Plot at_bats on the x-axis and runs on the y-axis. Describe the relationship between the two
variables in terms of direction (positively or negatively correlated).

(c) How confident are you in your ability to predict a team’s season runs scored if you knew only the team’s at-bats?

(d) Find the slope and intercept of the regression line through the dataset. Plot the corresponding
line over the scatterplot in (b).

(e) Suppose the manager of a team comes and asks you to predict how many runs his team will score if they get 5000 at-bats, 5500 at-bats, and 6000 at-bats. What would you predict for each case?
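
As a rough starting point, the workflow in parts (a) through (e) might be sketched in R as follows. This is only one possible approach, not a required solution. It assumes that "mlb09.csv" has a header row and that its columns are named at_bats and runs, as in part (b). The helper names fit_runs_model and predict_runs are made up for illustration and are not part of the assignment.

```r
# Sketch of parts (a)-(e). ASSUMPTION: "mlb09.csv" has a header row with
# columns named at_bats and runs (the names used in part (b)).

# Hypothetical helper: fit the simple linear regression runs ~ at_bats.
fit_runs_model <- function(mlb) {
  lm(runs ~ at_bats, data = mlb)
}

# Hypothetical helper: predict season runs for a vector of at-bat totals.
predict_runs <- function(fit, at_bats) {
  unname(predict(fit, newdata = data.frame(at_bats = at_bats)))
}

# Only run the analysis if the data file is in the working directory.
if (file.exists("mlb09.csv")) {
  # (a) Import the data from the current working directory.
  mlb <- read.csv("mlb09.csv", header = TRUE)

  # (b) Scatterplot of runs against at-bats.
  plot(mlb$at_bats, mlb$runs, xlab = "At-bats", ylab = "Runs scored")

  # (c) The correlation gives a rough sense of how strong the linear trend is.
  print(cor(mlb$at_bats, mlb$runs))

  # (d) Intercept and slope, then the fitted line drawn over the scatterplot.
  fit <- fit_runs_model(mlb)
  print(coef(fit))
  abline(fit)

  # (e) Predicted runs at the three requested at-bat totals.
  print(predict_runs(fit, c(5000, 5500, 6000)))
}
```

When interpreting the output for part (e), it is worth comparing each requested total with range(mlb$at_bats): a prediction at a value outside the observed range is an extrapolation and deserves less trust than one inside it.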