ID: 603-904-125
Stephanie Lee
October 7, 2010
Stats10: Disc 2D-Mine

In this lab, we try to determine which batting statistics best predict the number of runs a team scores in a season. Using observational data collected in 2009 from thirty Major League Baseball teams, we compared each numerical variable with the number of runs a team scored in a season using scatterplots. We used various tools in Fathom, both graphical and numerical, including adding a regression line, to examine the linear relationship between runs scored in a season and each of the other variables in the data. We found that the variable that best predicts a team's runs scored in a season is the sum of on-base and slugging percentages, both underused statistics, which accounted for approximately 92% of the variation in runs. Given these results, we can expect teams to focus more on a player's on-base and slugging percentages when recruiting new members, because doing so appears to best predict the number of runs scored and, in turn, to maximize a team's success in Major League Baseball.

1. I used a scatterplot to show the linear relationship between runs and at-bats. The graph shows a moderate positive linear association between runs (the dependent y-variable) and at-bats (the independent x-variable): as at-bats increase, runs tend to increase as well.

[Figure: Batting09 scatter plot, runs (y-axis) versus at_bats (x-axis)]

2. The sum of squares decreases as the line approaches the best-fit position, because the best-fit line is the one that minimizes the total squared vertical distance of the points from the line (the distances are squared so that all values are positive).
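To illustrate the sum-of-squares idea in item 2 outside of Fathom, here is a minimal Python sketch. It is not part of the lab itself: the function names are my own, and the at-bats and runs values are made up for illustration, not taken from the 2009 data. It computes the sum of squared vertical distances for any candidate line, finds the least-squares line, and reports r-squared, the proportion of variation in runs that the line accounts for.

```python
# Hypothetical illustration data (NOT the real 2009 team statistics).
at_bats = [5400, 5450, 5500, 5550, 5600, 5650]
runs = [650, 700, 690, 760, 780, 830]


def sum_of_squares(xs, ys, slope, intercept):
    """Sum of squared vertical distances between each point and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Slope and intercept of the line that minimizes the sum of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """Proportion of the variation in ys accounted for by the best-fit line."""
    slope, intercept = least_squares_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    total = sum((y - mean_y) ** 2 for y in ys)
    leftover = sum_of_squares(xs, ys, slope, intercept)
    return 1 - leftover / total


best_slope, best_intercept = least_squares_line(at_bats, runs)
best_ss = sum_of_squares(at_bats, runs, best_slope, best_intercept)

# Moving the line away from the best-fit position increases the sum of squares.
shifted_ss = sum_of_squares(at_bats, runs, best_slope, best_intercept + 10)
tilted_ss = sum_of_squares(at_bats, runs, best_slope + 0.05, best_intercept)

print("best-fit sum of squares:", best_ss)
print("shifted line sum of squares:", shifted_ss)
print("tilted line sum of squares:", tilted_ss)
print("r-squared:", r_squared(at_bats, runs))
```

Running the sketch with other candidate slopes and intercepts should always give a sum of squares at least as large as the best-fit value, which mirrors what happens when the movable line is dragged around in Fathom.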

