• Population
  – A set of items (experimental units) under study
• Parameter (Variable)
  – A descriptive measure of the population that is of interest, e.g. the mean (typically unknown; denoted by Greek letters)
• (Random) Sample
  – A (random) subset chosen from the population
• Statistic
  – A descriptive measure that is calculated from the sample, e.g. the sample mean (denoted by regular (Roman) letters)
• Purpose of Inferential Statistics
  – Making inferences about a parameter (symbol: μ) of a population based on information obtained from a statistic of the sample.
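To make the parameter/statistic distinction concrete, here is a minimal Python sketch. The population is simulated (made-up values, not data from these notes), and the names `population`, `mu`, `sample` and `x_bar` are illustrative assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 simulated measurements (assumed, not from the notes).
population = [random.gauss(50, 10) for _ in range(10_000)]

# Parameter: the population mean (mu). In practice it is usually unknown.
mu = statistics.fmean(population)

# Statistic: the sample mean (x-bar), computed from a random sample of 100 units.
sample = random.sample(population, 100)
x_bar = statistics.fmean(sample)

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

The two printed values should be close but not identical; inference is about using the second to say something about the first.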
• The confidence level is the proportion of times that an estimation procedure will be correct: 1 − α.
• The significance level measures how frequently the conclusion will be wrong in the long run: α.
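One way to see "proportion of times the procedure is correct" is by simulation. The sketch below is an illustration under assumptions that are not in the notes: a normal population with known σ, a z-based interval for the mean, and α = 0.05. All numeric settings are made up.

```python
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the sketch is reproducible

alpha = 0.05                   # significance level (assumed value)
confidence_level = 1 - alpha   # confidence level
z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value, about 1.96

# Made-up population settings and simulation size.
mu, sigma, n, trials = 100.0, 15.0, 30, 2000
half_width = z * sigma / n ** 0.5

hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    # The procedure is "correct" when the interval contains the true mean.
    if x_bar - half_width <= mu <= x_bar + half_width:
        hits += 1

coverage = hits / trials
print(f"confidence level: {confidence_level:.2f}, observed coverage: {coverage:.3f}")
```

Over many repetitions the observed coverage should settle near the confidence level, and the miss rate near α.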
• Interval
  – Values are real numbers.
  – All calculations are valid.
  – Data may be treated as ordinal or nominal.
• Ordinal
  – Values must represent the ranked order of the data.
  – Calculations based on an ordering process are valid.
  – Data may be treated as nominal but not as interval.
• Nominal
  – Values are the arbitrary numbers that represent categories.
  – Only calculations based on the frequencies of occurrence are valid.
  – Data may not be treated as ordinal or interval.
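As a rough illustration of which calculations suit each data type, the following Python sketch uses one made-up list per type. The variable names and values are assumptions chosen for the example.

```python
import statistics
from collections import Counter

# Made-up data, one list per data type.
incomes = [32.5, 41.0, 58.2, 41.0, 75.9]                # interval: real numbers
ratings = [1, 3, 2, 3, 4, 3]                            # ordinal: 1 = poor ... 4 = excellent
marital = ["single", "married", "married", "divorced"]  # nominal: categories

# Interval: all calculations are valid, including the mean.
mean_income = statistics.fmean(incomes)

# Ordinal: order-based calculations such as the median are valid.
# median_low returns an actual data value rather than averaging two ranks.
median_rating = statistics.median_low(ratings)

# Nominal: only calculations based on frequencies of occurrence are valid.
marital_counts = Counter(marital)

print(mean_income, median_rating, marital_counts)
```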
• We can summarize the data in a table that presents the categories and their counts, called a frequency distribution. The counts are numbers and are typically displayed in a bar chart.
• A relative frequency distribution lists the categories and the proportion with which each occurs. The proportions are percentages and are typically displayed in a pie chart.
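A small Python sketch of both tables, using made-up survey responses (the data and names are assumptions for illustration):

```python
from collections import Counter

# Made-up nominal data: preferred drink of 10 survey respondents.
responses = ["tea", "coffee", "coffee", "juice", "tea",
             "coffee", "coffee", "tea", "juice", "coffee"]

# Frequency distribution: category -> count.
frequency = Counter(responses)

# Relative frequency distribution: category -> proportion of all observations.
n = len(responses)
relative_frequency = {category: count / n for category, count in frequency.items()}

# Print one row per category: name, count, percentage.
for category, count in frequency.most_common():
    print(f"{category:<8}{count:>3}{relative_frequency[category]:>8.0%}")
```

The counts would feed a bar chart and the proportions a pie chart.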
• The most important graphical method is the histogram. The histogram is not only a powerful graphical technique used to summarize interval data, but it is also used to help explain probabilities.
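A histogram is built by counting how many observations fall in each class interval. The sketch below does this by hand on made-up exam scores and prints a text version; the class boundaries (width 10, starting at 50) are an assumption chosen for the example.

```python
# Made-up interval data: 12 exam scores.
scores = [52, 67, 71, 58, 74, 88, 93, 65, 79, 81, 69, 76]

# Class intervals [50, 60), [60, 70), ..., [90, 100).
low, width, n_bins = 50, 10, 5

counts = [0] * n_bins
for score in scores:
    # Integer division finds the class; min() keeps a score of exactly 100 in the last class.
    index = min((score - low) // width, n_bins - 1)
    counts[index] += 1

# Text histogram: one row of asterisks per class.
for i, count in enumerate(counts):
    start = low + i * width
    print(f"{start:>3}-{start + width:<3} {'*' * count}")
```

Dividing each count by the number of observations gives the relative frequency of the class, which is how a histogram connects to probabilities.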
• Skewness
  – A skewed histogram is one with a long tail extending to either the right or the left.
• Modality
  – A unimodal histogram is one with a single peak, while a bimodal histogram is one with two peaks.
• How two interval variables are related can be shown with a scatter diagram, which plots two variables against one another.
  – The independent variable is labeled X and is usually placed on the horizontal axis, while the other, dependent variable, Y, is mapped to the vertical axis.
• The following are typical measures of central tendency for a population:
  – Mean: the average
  – Median: the middle observation after the data has been ordered
  – Mode: the observation that occurs most often
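The three measures can be computed with Python's standard `statistics` module. The data below are made up for illustration.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # made-up observations

mean = statistics.mean(data)      # sum divided by count: 30 / 6
median = statistics.median(data)  # average of the two middle values: (3 + 5) / 2
mode = statistics.mode(data)      # most frequent observation

print(mean, median, mode)
```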
• If a distribution is symmetrical (and unimodal), the mean, median and mode coincide.
• If a distribution is not symmetrical and is skewed to the left or to the right, the three measures differ. Negative skew means skewed to the left; positive skew means skewed to the right.
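A quick check of this behaviour on two small made-up data sets (the values are assumptions chosen so the arithmetic is easy to follow):

```python
import statistics

symmetric = [1, 2, 3, 3, 3, 4, 5]      # symmetric around 3
right_skewed = [1, 2, 2, 3, 4, 5, 18]  # long tail to the right (positive skew)

sym_mean = statistics.mean(symmetric)
sym_median = statistics.median(symmetric)
sym_mode = statistics.mode(symmetric)

skew_mean = statistics.mean(right_skewed)
skew_median = statistics.median(right_skewed)
skew_mode = statistics.mode(right_skewed)

print("symmetric:   ", sym_mean, sym_median, sym_mode)
print("right-skewed:", skew_mean, skew_median, skew_mode)
```

In the symmetric set all three measures agree; in the right-skewed set the large value pulls the mean above the median, which in turn sits above the mode.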
• Measures of Variability
  – Population
    • Variance σ²
    • Standard Deviation σ
  – Sample
    • Range
    • Variance s²
    • Standard Deviation s
• When we are talking about a sample, the range is the difference between the highest and lowest observations.
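For a concrete feel for the sample measures of variability, here is a minimal Python sketch on made-up data. It assumes the usual textbook convention that the sample variance divides by n − 1, which these notes do not state explicitly.

```python
import statistics

sample = [4, 8, 6, 10, 12]  # made-up sample observations

# Range: highest observation minus lowest observation.
sample_range = max(sample) - min(sample)

# Sample variance s^2 (statistics.variance divides by n - 1).
s_squared = statistics.variance(sample)

# Sample standard deviation s: the square root of the sample variance.
s = statistics.stdev(sample)

print(sample_range, s_squared, s)
```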
• POPULATION VARIANCE (σ²)
  – Averages the squares of the differences
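The line above is cut off in the source. As a hedged illustration, the sketch below assumes the standard definition: the population variance is the average of the squared differences between each observation and the population mean. The data are made up.

```python
import statistics

population = [4, 8, 6, 10, 12]  # made-up values, treated as the whole population

N = len(population)
mu = sum(population) / N  # population mean

# Square each difference from the mean, then average over all N items.
squared_differences = [(x - mu) ** 2 for x in population]
sigma_squared = sum(squared_differences) / N

# Population standard deviation: the square root of the variance.
sigma = sigma_squared ** 0.5

print(mu, sigma_squared, sigma)
print(statistics.pvariance(population))  # library cross-check of the hand calculation
```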
Fall '11
Kim
Normal Distribution, Standard Deviation, Variance, Probability distribution, Probability theory
