The distribution of a variable shows its pattern of variation, as given by the values of the
variable and their frequencies. The following data set, SAT_DATA.XLS or SAT_DATA.MTW (data from
the College Board), contains the mean SAT scores for each of the 50 US states and Washington
D.C., as well as the participation rate and geographic region of each state. The patterns in the
data, however, are not yet clear. To get an idea of the pattern of variation of a categorical
variable such as region, we can display the information with a bar graph or pie chart.
This should result in the following pie chart:
In Minitab, if you place your mouse over any slice of the pie, you will see the percentage of the
whole pie that the region covers. For example, place your mouse over the blue slice (again, this
has to be done in Minitab, not on the notes!) and you will see that Region MA (Mid Atlantic)
accounts for 5.9% of the 50 states plus Washington D.C.
To produce a bar graph or bar chart, return to the menu bar in Minitab and from the Graph
options select Bar Chart, then Simple. The steps then proceed as they did from Step 3 above. In
the Minitab bar chart, however, placing your mouse over a bar shows the count within that
category rather than a percentage. For example, if you place your mouse over the bar labeled MA
(again, this has to be done in Minitab, not on the notes!) you will see that three (3) of the 50
states plus Washington D.C. are classified as Mid Atlantic. Note that 3/51 equals the 5.9% from
the pie chart:
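The link between the bar chart counts and the pie chart percentages can be checked outside Minitab. The following is a minimal Python sketch, not part of the original Minitab steps. The function name `frequency_table` and the label `OTHER` are illustrative assumptions; only the Mid Atlantic count (3 of 51) comes from the notes, and the remaining 48 observations are lumped into one placeholder category because their regional breakdown is not given here.

```python
from collections import Counter


def frequency_table(labels):
    """Return {category: (count, percent)} for a list of category labels."""
    counts = Counter(labels)
    total = len(labels)
    return {cat: (n, round(100 * n / total, 1)) for cat, n in counts.items()}


# Only the MA count (3 of 51) is taken from the notes.
# "OTHER" is a hypothetical placeholder for the remaining 48 observations.
labels = ["MA"] * 3 + ["OTHER"] * 48

table = frequency_table(labels)
print(table["MA"])  # (3, 5.9)
```

The count is what the bar chart reports and the percentage is what the pie chart reports, so both displays describe the same distribution.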
But what of variables that are quantitative, such as math SAT or percentage taking the SAT? For
these variables we should use histograms, boxplots, or stem-and-leaf plots. Stem-and-leaf plots
are sometimes referred to as stemplots. Histograms differ from bar graphs in that they represent
frequencies by area and not height. A good display will help to summarize a distribution by
reporting the center, spread, and shape for that variable.
For now the goal is to summarize the distribution or pattern of variation of a single
quantitative variable. To draw a histogram by hand we would:
1. Divide the range of data (range is from the smallest to largest value within the data for
the variable of interest) into classes of equal width. For the math SAT scores the range is
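
As a rough illustration of the class-division step, here is a minimal Python sketch of splitting a range into classes of equal width and counting the values in each class. It is not the Minitab procedure from these notes. The function name `equal_width_classes`, the choice of four classes, and the scores are all hypothetical and are not taken from the SAT data set.

```python
def equal_width_classes(values, n_classes):
    """Split the range of `values` into n_classes equal-width classes.

    Returns (edges, counts): edges has n_classes + 1 boundaries, and
    counts[i] is the number of values falling in class i. The largest
    value is placed in the last class.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; the range has zero width")
    width = (hi - lo) / n_classes
    edges = [lo + i * width for i in range(n_classes + 1)]
    counts = [0] * n_classes
    for v in values:
        # Clamp the index so the maximum value lands in the final class.
        idx = min(int((v - lo) / width), n_classes - 1)
        counts[idx] += 1
    return edges, counts


# Hypothetical scores for illustration only (not the actual SAT data).
scores = [450, 470, 480, 500, 510, 520, 550, 590, 600, 650]
edges, counts = equal_width_classes(scores, 4)
print(edges)   # [450.0, 500.0, 550.0, 600.0, 650.0]
print(counts)  # [3, 3, 2, 2]
```

Each class here includes its lower boundary and excludes its upper one, except the last class, which also includes the maximum. Other conventions are possible, so check which one your software uses.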
 Spring '11
 AndyRegards
