STA309 Exam 1
Examining distributions with graphs
• Variables: the characteristics of the individuals. Individuals may be people or objects. A variable can take different values for different individuals.
  o Types of variables:
    Quantitative variables: take numerical values for which arithmetic makes sense.
      • Examples: class size, height, pulse, haircut cost, money in coins
    Categorical variables: record which category a person or thing falls into.
      • Examples: gender, feelings about class
• Distributions: Describe a variable’s pattern of variation. A distribution tells the values of the variable and how often they occur.
  o Why is there variation?
    Individuals vary.
    Measurement errors.
• Categorical variables:
  o The distribution lists the categories and gives either the count or the percent of individuals in each category.
  o Pie charts: help us see what part of the whole each group forms.
  o Bar charts: the height of each bar shows the count (or percent) in that category.
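As a minimal sketch of the count-or-percent idea for a categorical variable (the function name `categorical_distribution` and the responses are made up for illustration, not taken from the course):

```python
from collections import Counter

def categorical_distribution(values):
    """Return {category: (count, percent)} for a list of category labels."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}

# Hypothetical "feelings about class" responses, for illustration only.
feelings = ["excited", "nervous", "excited", "neutral",
            "excited", "nervous", "neutral", "excited"]
dist = categorical_distribution(feelings)

# Text version of a bar chart: one '#' per individual in the category.
for cat, (n, pct) in sorted(dist.items()):
    print(f"{cat:8s} {n:2d} {pct:5.1f}% {'#' * n}")
```

The percents across all categories add up to 100, which is what lets a pie chart show each group as a part of the whole.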
• Quantitative variables
  o Histograms: Divide the range of the data into classes of equal width. Count the number of observations in each class. Draw the histogram with the height of each bar equal to the count for its class.
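The class-counting step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `histogram_counts` and the pulse values are invented, and the classes are taken to run from the smallest to the largest observation, with the largest value placed in the last class.

```python
def histogram_counts(data, num_classes):
    """Split the range of data into equal-width classes and count each class."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all observations are equal; no range to divide")
    width = (hi - lo) / num_classes
    counts = [0] * num_classes
    for x in data:
        # Class index of x; the maximum value is assigned to the last class.
        index = min(int((x - lo) / width), num_classes - 1)
        counts[index] += 1
    edges = [lo + k * width for k in range(num_classes + 1)]
    return edges, counts

# Hypothetical pulse measurements, for illustration only.
pulse = [60, 62, 65, 68, 70, 72, 72, 75, 78, 80, 84, 90]
edges, counts = histogram_counts(pulse, 3)

# Text histogram: bar length equals the count in the class.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {'*' * count} ({count})")
```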
    In Excel:
      • Rows – cases
      • Columns – variables
      • Bin – the center point of the interval; type the numbers you want for the range into a column and select that column
      • To make the bars touch, click on a bar and select “Format Data Series”
      • Use Excel only for shape; the data can be very wrong, esp. in the 1st interval
      • Descriptive stats: skewness (+) right skewed; (−) left skewed; > 1 pretty skewed
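To see where the sign of the skewness comes from, here is a rough Python sketch. It assumes the skewness reported in the descriptive statistics is the adjusted sample skewness, n / ((n − 1)(n − 2)) · Σ((x − mean) / s)³, which is the formula used by Excel's SKEW function; the function names and the data are hypothetical, and applying the "> 1" rule to the absolute value is also an assumption.

```python
import math

def sample_skewness(data):
    """Adjusted sample skewness (assumed to match Excel's SKEW formula)."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))  # sample std. dev.
    if s == 0:
        raise ValueError("all observations are equal")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

def describe_skew(g):
    """Apply the rule of thumb from the notes: sign gives direction, > 1 is 'pretty skewed'."""
    direction = "right skewed" if g > 0 else "left skewed" if g < 0 else "symmetric"
    # Assumption: a value below -1 counts as 'pretty skewed' to the left.
    return direction + (" (pretty skewed)" if abs(g) > 1 else "")

# Hypothetical data with one large value that creates a long right tail.
haircut_cost = [1, 2, 2, 3, 3, 3, 4, 20]
g = sample_skewness(haircut_cost)
print(round(g, 2), describe_skew(g))
```

Mirroring the data (replacing each value by its negative) flips the sign of the result, which matches the (+) right / (−) left rule.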
  o Stem plots: Separate each observation into:
      Stem – all but the final digit
      Leaf – the final digit
    Write the stems from top (smallest) to bottom (largest). Write each leaf to the right of its stem, in increasing order.
    Advantages of stem plots: Used for small data sets. Quicker to make. Presents more detailed information.
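The stem-and-leaf steps above can be sketched as follows. This is an illustrative helper only (the name `stemplot` and the data are made up), and it assumes non-negative whole-number observations; stems with no leaves are still listed so the shape is not distorted.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines for a non-empty list of non-negative integers."""
    leaves = defaultdict(list)
    for v in sorted(values):          # sorting puts leaves in increasing order
        stem, leaf = divmod(v, 10)    # stem: all but final digit; leaf: final digit
        leaves[stem].append(leaf)
    lines = []
    for stem in range(min(leaves), max(leaves) + 1):  # smallest stem on top
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {digits}".rstrip())
    return lines

# Hypothetical pulse measurements, for illustration only.
pulse = [75, 64, 58, 96, 62, 71, 64, 79]
for line in stemplot(pulse):
    print(line)
```

Unlike a histogram, every original observation can be read back from the plot, which is the "more detailed information" advantage.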
  o Time plots: Illustrate measurements taken over time. X-axis – time; Y-axis – measurement. Used to describe trends, seasonality, fluctuations, and cycles.
• Examining distributions: Look for the overall pattern and striking deviations from the pattern (such as outliers).
  o Use histograms and stemplots to describe:
    Shape
      • Symmetric – right and left sides are approximately mirror images (don't expect perfection in real data)
      • Skewed – one tail extends farther than the other
        o Right skewed means a long tail on the right (quite common)
        o Sometimes you may have to omit outliers (b/c they could be incorrect)
      • Bimodal: two humps (ex. Old Faithful’s eruptions)
    Center
    Spread
Describing distributions with numbers:
• Measuring the center:
  o Median: The median is the midpoint of a distribution. Half of the observations are smaller. Half of the observations are larger.
    To find the median:
      1. Arrange all observations in ascending order.
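A short Python sketch of finding the median. Only the first step (sorting) comes from these notes; the rest follows the usual textbook convention, assumed here: take the middle observation when the count is odd, and the average of the two middle observations when it is even. The function name `median` is illustrative.

```python
def median(observations):
    """Midpoint of the data: half the observations below, half above."""
    ordered = sorted(observations)   # step 1: ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("no observations")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]          # odd count: the single middle value
    # Even count (assumed convention): average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([5, 1, 3]))      # odd number of observations
print(median([4, 1, 3, 2]))   # even number of observations
```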
 Spring '07
 Gemberling
 Normal Distribution, Standard Deviation, Center
