# LAB10.pdf - Statistics 323 Lab#10 Simple Linear Regression...


Statistics 323 - Lab #10: Simple Linear Regression and R

Lab Exercise 1: Was Barry Bonds using Steroids?

The following bivariate data set gives the season (year) and the number of home runs divided by the number of at bats (attempts to hit the ball) for each season. The raw number of home runs is not used because later in his career he was given intentional walks, which do not count as at bats.

| Season | Home Run to At-Bat Ratio (HRAT) |
|--------|---------------------------------|
| 1987   | 0.045372                        |
| 1988   | 0.044610                        |
| 1989   | 0.032759                        |
| 1990   | 0.063584                        |
| 1991   | 0.049020                        |
| 1992   | 0.071882                        |
| 1993   | 0.085343                        |
| 1994   | 0.094629                        |
| 1995   | 0.065217                        |
| 1996   | 0.081238                        |
| 1997   | 0.075188                        |
| 1998   | 0.067029                        |
| 1999   | 0.095775                        |
| 2000   | 0.102083                        |
| 2001   | 0.1534                          |

(a) Enter the data into R, using the variable names season and hrat.
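For part (a), one possible way to enter the data is sketched below. The values are copied from the table above; using `1987:2001` for the seasons and the final length check are illustrative choices, not something the lab prescribes.

```r
# Seasons 1987 through 2001 (15 consecutive years)
season <- 1987:2001

# Home run to at-bat ratios, in the same order as the seasons
hrat <- c(0.045372, 0.044610, 0.032759, 0.063584, 0.049020,
          0.071882, 0.085343, 0.094629, 0.065217, 0.081238,
          0.075188, 0.067029, 0.095775, 0.102083, 0.1534)

# Sanity check: both vectors must have one entry per season
stopifnot(length(season) == length(hrat))
```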
(b) Plot the season on the X-axis and the Home Run to At-Bat ratio on the Y-axis. To do so in R, type

    > plot(season, hrat, xlab="Year", ylab="HomeRuntoAtBatRatio", pch=16, col="blue")

What do you notice about the 2001 data point? Comment.

(c) Remove the "2001" data point from each data vector. To do so in R:

    > season1 = season[1:14]
    > hrat1 = hrat[1:14]

Using R, express Bonds' Home Run to At-Bat ratio as a linear function of his 'years of experience', or 'year':

$$\text{hrat}_i = \beta_0 + \beta_1 \, \text{Year}_i + e_i$$

To estimate the model using R, type the following commands:

    > regressdata = lm(hrat1~season1)
    > summary(regressdata)

This produces the following output:
(d) Using the model estimated in (c), estimate Bonds' home run to at-bat ratio in the year 2001. In 2001, Bonds had 476 at bats. Compare this to the "actual" value given in the data set. What do you think? What do the numbers say? Do you think Barry Bonds' single-season home run record in 2001, when he hit 73 home runs, was due, either in whole or in part, to his steroid use? Do the numbers indicate he was using steroids? Discuss.
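The comparison in (d) can be sketched in R as follows. This is a minimal sketch, assuming the model from (c) is refit on the 1987 to 2000 seasons; the names `fit`, `pred_ratio`, `pred_hr`, `actual_ratio`, `actual_hr`, and `at_bats_2001` are illustrative and do not appear in the lab.

```r
# Data from the exercise, with the 2001 season held out as in part (c)
season1 <- 1987:2000
hrat1 <- c(0.045372, 0.044610, 0.032759, 0.063584, 0.049020,
           0.071882, 0.085343, 0.094629, 0.065217, 0.081238,
           0.075188, 0.067029, 0.095775, 0.102083)

# Same model as in part (c)
fit <- lm(hrat1 ~ season1)

# Point estimate of the 2001 ratio from the fitted line
pred_ratio <- unname(predict(fit, newdata = data.frame(season1 = 2001)))

# Convert ratios to home run counts using the 476 at bats given in (d)
at_bats_2001 <- 476
pred_hr <- pred_ratio * at_bats_2001

# Actual 2001 ratio from the data set, and the implied home run count
actual_ratio <- 0.1534
actual_hr <- actual_ratio * at_bats_2001

print(c(predicted_ratio = pred_ratio, actual_ratio = actual_ratio))
print(c(predicted_hr = pred_hr, actual_hr = actual_hr))
```

Multiplying a ratio by the number of at bats puts the estimate on the same scale as the 73 home runs Bonds actually hit, which makes the gap between the fitted trend and the observed season easier to discuss.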
You will need these statistics to compute a confidence interval for either (i) $\mu_{Y \mid X=2001}$ or (ii) $Y_{X=2001}$. Identify which of these intervals you will need in order to estimate, with 95% confidence, the number of home runs Barry Bonds was to hit during the 2001 season.
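Both intervals can also be obtained directly from R rather than by hand, which may serve as a check on the manual calculation. The sketch below assumes the model from (c) and uses `predict()` with `interval = "confidence"` (an interval for the mean response at a given X) and `interval = "prediction"` (an interval for a single new observation at that X). The object names are illustrative, and scaling by 476 at bats to get home run counts is an assumption carried over from part (d).

```r
# Data from the exercise, with the 2001 season held out as in part (c)
season1 <- 1987:2000
hrat1 <- c(0.045372, 0.044610, 0.032759, 0.063584, 0.049020,
           0.071882, 0.085343, 0.094629, 0.065217, 0.081238,
           0.075188, 0.067029, 0.095775, 0.102083)
fit <- lm(hrat1 ~ season1)

# The X value at which both intervals are evaluated
new_season <- data.frame(season1 = 2001)

# 95% interval for the mean ratio at X = 2001
ci_mean <- predict(fit, newdata = new_season,
                   interval = "confidence", level = 0.95)

# 95% interval for one new observed ratio at X = 2001
pi_single <- predict(fit, newdata = new_season,
                     interval = "prediction", level = 0.95)

# Rescale both intervals from ratios to home run counts (476 at bats)
print(ci_mean * 476)
print(pi_single * 476)
```

Each result has columns `fit`, `lwr`, and `upr`. The two intervals share the same center, and the prediction interval is the wider of the two because it also accounts for the variability of an individual season around the regression line.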