• Population – A set of items (experimental units) under study
• Parameter (Variable) – A descriptive measure of the population that is of interest, e.g. the mean (typically unknown; use Greek letters)
• (Random) Sample – A (random) subset chosen from the population
• Statistic – A descriptive measure that is calculated from the sample, e.g. the sample mean (use regular (Roman) letters)
• Purpose of Inferential Statistics – Making inferences about a parameter (symbol: μ) of a population based on information obtained from a statistic of the sample.
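As a minimal sketch of this idea, the Python snippet below builds a small made-up population, draws a random sample from it, and compares the sample mean (a statistic) with the population mean (the parameter μ). The population values, sample size and seed are illustrative assumptions, not data from these notes.

```python
import random
import statistics

random.seed(1)  # arbitrary fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up measurements.
population = [random.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)       # parameter: population mean (normally unknown)

# Random sample of 100 items drawn without replacement.
sample = random.sample(population, 100)
x_bar = statistics.mean(sample)        # statistic: sample mean

print(f"population mean (mu) = {mu:.2f}")
print(f"sample mean (x-bar)  = {x_bar:.2f}")
```

In practice only the sample is observed, so x-bar is used to draw a conclusion about the unknown μ.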
• The confidence level is the proportion of times that an estimation procedure will be correct (= 1 − α).
• The significance level measures how frequently the conclusion will be wrong in the long run (= α).
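The "proportion of times" reading of the confidence level can be checked with a small simulation. The sketch below assumes a normal population with a known standard deviation and uses the usual z interval around the sample mean; the interval formula, the constants `MU`, `SIGMA`, `N` and the number of trials are assumptions chosen for illustration and are not taken from these notes.

```python
import random
import statistics

random.seed(2)  # arbitrary fixed seed so the sketch is repeatable

MU, SIGMA, N = 100.0, 15.0, 25   # hypothetical population mean, std dev, sample size
CONFIDENCE = 0.95                # confidence level, 1 - alpha
alpha = 1 - CONFIDENCE           # significance level

# z value that leaves alpha/2 in each tail of the standard normal curve
z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
half_width = z * SIGMA / N ** 0.5

trials = 2000
hits = 0
for _ in range(trials):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = statistics.mean(sample)
    # The procedure is "correct" when the interval contains the true mean.
    if x_bar - half_width <= MU <= x_bar + half_width:
        hits += 1

coverage = hits / trials
print(f"intervals containing mu: {coverage:.3f} (target {CONFIDENCE})")
```

Over many repetitions the share of intervals that contain μ should settle near 1 − α, and the share that miss it near α.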
• Interval
  – Values are real numbers.
  – All calculations are valid.
  – Data may be treated as ordinal or nominal.
• Ordinal
  – Values must represent the ranked order of the data.
  – Calculations based on an ordering process are valid.
  – Data may be treated as nominal but not as interval.
• Nominal
  – Values are arbitrary numbers that represent categories.
  – Only calculations based on the frequencies of occurrence are valid.
  – Data may not be treated as ordinal or interval.
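One way to see why only order-based calculations are valid for ordinal data is to recode the values while preserving their order. In the sketch below, the ratings and the `recode` mapping are hypothetical; the median picks out the same category under both codings, while the mean changes with the arbitrary spacing of the codes.

```python
import statistics

# Hypothetical course ratings coded 1 = poor ... 5 = excellent (ordinal data).
ratings = [1, 2, 2, 3, 3, 3, 4, 5, 5]

# Any order-preserving recoding is an equally valid way to code ordinal data.
recode = {1: 10, 2: 20, 3: 25, 4: 40, 5: 100}
recoded = [recode[r] for r in ratings]

# The median identifies the same category under both codings.
median_original = statistics.median(ratings)
median_recoded = statistics.median(recoded)
print(recode[median_original] == median_recoded)  # True

# The mean depends on the spacing between codes, so it is not meaningful here.
print(statistics.mean(ratings))
print(statistics.mean(recoded))
```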
• We can summarize the data in a table that presents the categories and their counts, called a frequency distribution. It reports numbers (counts) and is displayed as a bar chart.
• A relative frequency distribution lists the categories and the proportion with which each occurs. It reports percentages and is displayed as a pie chart.
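Both tables can be built in a few lines of Python. The survey responses below are made up for illustration; `Counter` from the standard library does the counting.

```python
from collections import Counter

# Hypothetical nominal data: preferred transport for 10 survey respondents.
responses = ["car", "bus", "car", "bike", "car", "bus", "car", "bike", "bus", "car"]

frequency = Counter(responses)                  # category -> count
total = sum(frequency.values())
relative = {cat: count / total for cat, count in frequency.items()}  # category -> proportion

# Print category, count and percentage, most frequent category first.
for cat, count in frequency.most_common():
    print(f"{cat:5s} {count:3d} {relative[cat]:6.0%}")
```

The counts are what a bar chart would display, and the proportions are the slices of a pie chart.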
• The most important graphical method is the histogram. The histogram is not only a powerful graphical technique used to summarize interval data, but it is also used to help explain probabilities.
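A histogram is built by counting how many observations fall into each class interval. The sketch below does this for hypothetical data; the class width of 20 and the five classes are arbitrary choices for the example.

```python
# Hypothetical interval data: 12 monthly phone bills in dollars.
bills = [12.5, 48.0, 33.2, 75.9, 20.1, 55.5, 38.7, 41.0, 90.3, 29.9, 44.4, 61.8]

class_width = 20
num_classes = 5      # classes: [0,20), [20,40), [40,60), [60,80), [80,100)
counts = [0] * num_classes

for x in bills:
    index = int(x // class_width)   # which class the observation falls in
    counts[index] += 1

# Text "histogram": one star per observation in each class.
for i, count in enumerate(counts):
    lower = i * class_width
    print(f"{lower:3d}-{lower + class_width:<3d} {'*' * count}")
```

Dividing each count by the number of observations gives the proportion of data in that class, which is how a histogram connects to probabilities.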
• Skewness – A skewed histogram is one with a long tail extending to either the right or the left.
• Modality – A unimodal histogram is one with a single peak, while a bimodal histogram is one with two peaks.
• To show how two interval variables are related, use a scatter diagram, which plots the two variables against one another.
  – The independent variable is labeled X and is usually placed on the horizontal axis, while the other, dependent variable, Y, is mapped to the vertical axis.
• The following are typical measures of central tendency for a population:
  – Mean: the average
  – Median: the middle observation after the data has been ordered
  – Mode: the observation that occurs most often
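The three measures can be computed with Python's standard `statistics` module. The data set below is hypothetical.

```python
import statistics

# Hypothetical data set of 9 observations.
data = [3, 7, 7, 2, 9, 7, 4, 5, 10]

mean = statistics.mean(data)      # the average: sum of observations / number of observations
median = statistics.median(data)  # middle observation once the data are sorted
mode = statistics.mode(data)      # observation that occurs most often

print(mean, median, mode)  # 6 7 7
```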
• If a distribution is symmetrical (and unimodal), the mean, median and mode coincide.
• If a distribution is not symmetrical, and is skewed to the left or to the right, the three measures differ. Negatively skewed means skewed to the left (the long tail is on the left).
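The effect of skewness on the three measures can be seen by comparing two small data sets. Both are made up for illustration, and `summary` is a hypothetical helper defined only for this sketch.

```python
import statistics

def summary(data):
    """Return (mean, median, mode) for a list of numbers."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]       # single peak at 3, mirror-image tails
right_skewed = [1, 1, 1, 2, 2, 3, 4, 6, 16]   # long tail to the right

print(summary(symmetric))     # (3, 3, 3): all three coincide
print(summary(right_skewed))  # (4, 2, 1): mean > median > mode
```

In the right-skewed data the large value 16 pulls the mean toward the tail, while the median and mode stay near the bulk of the observations.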
• Measures of Variability
  – Population
    • Variance σ²
    • Standard Deviation σ
  – Sample
    • Range
    • Variance s²
    • Standard Deviation s
• When we are talking about a sample, the range is the difference between the highest and lowest observations.
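The measures of variability listed above can be sketched with the standard `statistics` module. The observations are hypothetical. Note that `pvariance` and `pstdev` treat the data as a whole population, while `variance` and `stdev` treat them as a sample and divide by n − 1.

```python
import statistics

# Hypothetical observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)       # highest minus lowest observation

# Treating the data as an entire population (divide by N):
pop_var = statistics.pvariance(data)     # sigma squared
pop_sd = statistics.pstdev(data)         # sigma

# Treating the data as a sample (divide by n - 1):
sample_var = statistics.variance(data)   # s squared
sample_sd = statistics.stdev(data)       # s

print(data_range, pop_var, pop_sd)
print(round(sample_var, 3), round(sample_sd, 3))
```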
• POPULATION VARIANCE (σ²) – Averages the squares of the differences
