Sampling and sampling distributions.
Chapter 7.
So far, we have looked at simple techniques to display and
summarize data, and at probability as a measure of uncertainty.
[Diagram: a sample is drawn from the population. We describe the sample data and compute measures of variability.]
• Parameters are numerical descriptive measures for populations (fixed, often unknown values).
For instance, the parameters of the normal distribution are its location and spread, described by µ and σ. The parameters of the binomial distribution are n and p.
• Statistics are numerical-valued functions of the sample observations (for example, the sample mean and the sample variance).
• We use sample statistics to make inferences about the
corresponding population parameters.
• Statistics vary from sample to sample and hence are random variables.
Example: take 100 samples from the population of middle-class Americans and record the ages in each sample. The average age of the population is μ.
Population: X_1, …, X_N
Sample 1: x̄_1 = 35
Sample 2: x̄_2 = 37
Sample 3: x̄_3 = 34
Sample 4: x̄_4 = 36
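The example above can be imitated in a few lines of Python. This is only a sketch: the population of ages is invented for illustration (uniform integers between 25 and 50), and the sample size of 50 is an arbitrary choice. It shows that the population mean is a single fixed number, while each sample produces its own mean.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 ages, made up for illustration only.
population = [rng.randint(25, 50) for _ in range(10_000)]
mu = statistics.mean(population)  # the parameter: one fixed value

# Four independent samples of size 50 (assumed size), one mean per sample.
sample_means = [statistics.mean(rng.sample(population, 50)) for _ in range(4)]

print(f"population mean: {mu:.2f}")
for k, xbar in enumerate(sample_means, start=1):
    print(f"sample {k} mean: {xbar:.2f}")
```

The four printed sample means generally differ from one another and from the population mean, which is the point of the diagram.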
• The probability distributions for statistics are called sampling distributions.
• The sampling distribution of a statistic is the probability distribution of its possible values, obtained by repeatedly drawing samples of size n from the population.
• The sampling distribution depends on the population distribution, the sample size n, and the way the sample is chosen.
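A sampling distribution can be approximated by simulation. The sketch below assumes a made-up, right-skewed population (exponential with mean 40) and simple random sampling without replacement. The helper name `sampling_distribution` and the number of repeats are illustrative choices, not part of the chapter. It compares the spread of the sample mean for two sample sizes.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed for repeatability

# Hypothetical skewed population, invented for this illustration.
population = [rng.expovariate(1 / 40) for _ in range(20_000)]

def sampling_distribution(n, repeats=2_000):
    """Return the means of `repeats` simple random samples of size n."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]

means_small = sampling_distribution(10)    # samples of size n = 10
means_large = sampling_distribution(100)   # samples of size n = 100

spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)

print(f"std. dev. of sample means, n=10:  {spread_small:.2f}")
print(f"std. dev. of sample means, n=100: {spread_large:.2f}")
```

With the larger sample size the simulated means cluster more tightly around the population mean, which illustrates how the sampling distribution depends on n.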
Sampling variability
• Suppose I repeat the sampling 100 times. I will get different
mean values.
Values on the x-axis are the means computed in the 100 samples.
$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
where N = population size and n = sample size.
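For comparison, the population mean can be written next to its sample counterpart. The sample-mean formula is the standard definition and is supplied here as an assumption, since only the population formula survives in the text. The small numeric case (a population of four values and a sample of two) is invented for illustration.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad\text{and}\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i

% Illustrative numbers: population {2, 4, 6, 8}, so N = 4
\mu = \frac{2 + 4 + 6 + 8}{4} = 5

% One possible sample {2, 4}, so n = 2
\bar{x} = \frac{2 + 4}{2} = 3
```

Here the sample mean of 3 is an estimate of the population mean of 5, and a different sample of two values would give a different estimate.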
Sample statistics are used to estimate population parameters.
Problems:
– Different samples provide different estimates of the population parameter.
– Sample results vary from sample to sample, so sampling error exists.
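Sampling error can be made concrete as the difference between a sample estimate and the parameter it estimates. The sketch below assumes a made-up population of ages (normal with mean 36 and standard deviation 8) and samples of size 30. Both choices are illustrative only.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed for repeatability

# Hypothetical population of ages, invented for this illustration.
population = [rng.gauss(36, 8) for _ in range(10_000)]
mu = statistics.fmean(population)  # population parameter

errors = []
for _ in range(1_000):
    xbar = statistics.fmean(rng.sample(population, 30))  # one sample estimate
    errors.append(xbar - mu)  # sampling error of that estimate

print(f"smallest error: {min(errors):.2f}")
print(f"largest error:  {max(errors):.2f}")
print(f"average error:  {statistics.fmean(errors):.3f}")
```

Individual samples overshoot or undershoot the parameter, but in this simulation the errors average out close to zero.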
