Stephanie Lee
October 7, 2010
Stats10: Disc 2DMine
In this lab, we are trying to see which batting statistics can best help predict the number
of runs scored by a team in a season.
Through an observational study in which data from thirty
different Major League Baseball teams were collected in 2009, we compared different numerical
variables to the number of runs a team scored in a season using scatterplots.
Both graphically and numerically, we used various tools in Fathom, including adding a
regression line, in order to
examine the linear relationship between runs scored in a season and each of the other variables in
the data.
We found the variable that best helps us predict a team’s runs scored in a season to be
the sum of on-base and slugging percentages, underused statistics, which accounted for
approximately 92% of the variation in runs scored.
After seeing these results, we can expect teams to focus more on
a player’s on-base and slugging percentages when recruiting potential new members,
because these statistics appear to best predict the number of runs scored and, in turn, a team’s
success in Major League Baseball.
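As a rough sketch of what a figure like 92% means, the Python snippet below computes r-squared, the fraction of the variation in the y-variable that a least-squares line on the x-variable accounts for. The lab itself used Fathom, not Python. The function name and the numbers are made up for illustration and are not the Batting09 data.

```python
def r_squared(x, y):
    """Fraction of the variation in y accounted for by a least-squares line on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # For a single predictor, r-squared is the squared correlation coefficient
    return sxy ** 2 / (sxx * syy)


# Hypothetical values for illustration only (NOT the Batting09 data)
on_base_plus_slugging = [0.700, 0.720, 0.750, 0.780, 0.800]
runs = [640, 690, 730, 800, 830]

print(round(r_squared(on_base_plus_slugging, runs), 3))
```

A value near 1 means the points lie close to the line, and a value near 0 means the line explains very little of the variation in runs.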
1. I used a scatterplot to show the linear relationship between runs and at-bats.
The graph
shows a moderate, positive linear association between runs (dependent y-variable) and at-bats
(independent x-variable): as at-bats increase, runs tend to increase as well.
[Scatter plot, Batting09 collection: runs (y-axis, about 600 to 950) versus at_bats (x-axis, about 5350 to 5700)]
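The strength and direction of a linear association like this one can also be checked numerically. The sketch below, in Python, not Fathom, computes the correlation coefficient and the least-squares line for a pair of variables. The function names and the data are hypothetical and stand in for the at_bats and runs columns; they are not the real 2009 values.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient r between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares regression line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical values for illustration only (NOT the Batting09 data)
at_bats = [5400, 5450, 5500, 5550, 5600]
runs = [650, 720, 700, 780, 760]

r = correlation(at_bats, runs)
slope, intercept = least_squares_line(at_bats, runs)
print("r =", round(r, 3), "slope =", round(slope, 3))
```

A positive r and a positive slope both say the same thing the scatterplot does: teams with more at-bats tend to score more runs.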
2. The sum of squares decreases as the line approaches the best-fit position, because the best-fit
line is the one that minimizes the total squared vertical distance of the points from the line
(the distances are squared to make all values positive).
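To illustrate this, the Python sketch below tries several candidate slopes for a line through the point of means and prints the sum of squares for each, much like dragging a movable line toward the best-fit position. The function names and the data are hypothetical and chosen only to keep the arithmetic small.

```python
def sum_of_squares(x, y, slope, intercept):
    """Sum of squared vertical distances from each point to the line."""
    return sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))


def best_fit_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical values for illustration only (NOT the Batting09 data)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

best = best_fit_slope(x, y)

# Each candidate line passes through the point of means, as the best-fit line does
for slope in [0.0, 0.3, best, 0.9]:
    intercept = mean_y - slope * mean_x
    print(round(slope, 2), round(sum_of_squares(x, y, slope, intercept), 3))
```

The sum of squares shrinks as the slope moves toward the best-fit value and grows again once the slope moves past it.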
