2.1 Survey sampling
Descriptive and inferential statistics
Two types of statistical analysis exist to describe survey data: descriptive
statistics and inferential statistics.
Descriptive statistics focuses on summarizing survey data about a sample
drawn from a population. Summary statistics include measures of central
tendency, such as mean, median, and mode, and measures of dispersion, such
as range and standard deviation. Descriptive statistics alone cannot support
conclusions about the population. Rather, descriptive statistics is a way to
present data in a meaningful way.
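The summary statistics named above can be computed with a short sketch like the following. It uses only Python's standard `statistics` module; the `ages` list is hypothetical data invented for illustration and does not come from any survey in this section.

```python
import statistics

# Hypothetical survey responses (ages of eight respondents); illustrative only.
ages = [23, 25, 25, 31, 34, 40, 41, 53]

summary = {
    # Measures of central tendency
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    # Measures of dispersion
    "range": max(ages) - min(ages),
    "sample_std_dev": statistics.stdev(ages),  # divides by n - 1
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note that `statistics.stdev` computes the sample standard deviation; `statistics.pstdev` would be the choice if the data were the entire population.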
Inferential statistics focuses on using information from the sample to make
conclusions about the population from which the sample was drawn. The two
primary methods of inferential statistics are confidence intervals, which give
a range of plausible values for a parameter at a stated confidence level, and
hypothesis testing, which assesses claims about population parameters, such
as whether two parameters differ.
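As a rough illustration of the two methods, the sketch below computes an approximate confidence interval for a mean and a two-sample test for a difference in means. Everything here is an assumption made for illustration: the function names `mean_confidence_interval` and `two_sample_z_test` are hypothetical, the data are simulated, and both functions rely on a large-sample normal approximation (for small samples a t-distribution would usually be preferred).

```python
import math
import random
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for a population mean (normal approximation)."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    # Critical value, e.g. about 1.96 for 95% confidence
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    center = statistics.mean(sample)
    return center - margin, center + margin

def two_sample_z_test(a, b):
    """Two-sided large-sample z-test of the hypothesis that two population means are equal."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    z = (statistics.mean(a) - statistics.mean(b)) / standard_error
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return z, p_value

# Simulated measurements for two hypothetical groups (fixed seed for repeatability).
rng = random.Random(0)
group_a = [rng.gauss(120, 15) for _ in range(50)]
group_b = [rng.gauss(128, 15) for _ in range(50)]

low, high = mean_confidence_interval(group_a)
z, p_value = two_sample_z_test(group_a, group_b)
print(f"95% CI for group A mean: ({low:.1f}, {high:.1f})")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```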
Surveys
Surveys are conducted to allow statisticians to make generalizations about a
population.
A population is any collection of objects, people, or things about which
statistical inferences are made. A parameter of a population is a numerical
characteristic of a population, such as mean, median, or standard deviation.
A sampling unit is an individual in the population on which a measurement can
be taken.
The sampling frame is the subset of the population from which a sample is
drawn.
The sample is composed of the sampling units from which data are actually
collected.
A statistic is a numerical characteristic of a sample, rather than the population.
The following animation shows the relationship between the population,
sampling unit, sampling frame, and sample.
2.1.1: Sampling a population.
[Animation: population, sampling unit, sampling frame, sample.]
The sample is the subset of the sampling frame from which measurements are actually
taken.
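The relationship between population, sampling frame, and sample can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the population of 1,000 people, the 80% reachability rate used to build the frame, and the sample size of 50 are all made up, and simple random sampling is only one of several possible sampling designs.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Population: 1,000 hypothetical people; each one is a sampling unit.
population = [f"person_{i}" for i in range(1000)]

# Sampling frame: the units that can actually be reached
# (assumed here to be roughly 80% of the population).
sampling_frame = [unit for unit in population if rng.random() < 0.8]

# Sample: units drawn from the frame by simple random sampling,
# without replacement. Measurements are taken only on these units.
sample = rng.sample(sampling_frame, k=50)

print(len(population), len(sampling_frame), len(sample))
```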
The following animation shows the relationship between a parameter and a
statistic.
2.1.2: Parameters and statistics.
[Animation: two samples, A and B, are drawn from a population of ages.
Population mean: 3 (parameter). Sample A's mean: 2.8 and Sample B's mean: 3.2
(statistics).]
The population mean is a parameter. The mean of each sample is a statistic.
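The distinction can be made concrete with a small simulation. In this sketch the population of ages is hypothetical (the values 1 through 5, each repeated 20 times, chosen so that the population mean is 3), and the two samples are drawn at random, so their means will generally differ from each other and from the parameter.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100 ages with a mean of exactly 3.
population = [1, 2, 3, 4, 5] * 20

# Parameter: computed from every unit in the population.
population_mean = statistics.mean(population)

# Statistics: computed from samples, so they vary from sample to sample.
sample_a = rng.sample(population, k=10)
sample_b = rng.sample(population, k=10)
sample_a_mean = statistics.mean(sample_a)
sample_b_mean = statistics.mean(sample_b)

print(f"Population mean (parameter): {population_mean}")
print(f"Sample A mean (statistic): {sample_a_mean}")
print(f"Sample B mean (statistic): {sample_b_mean}")
```

The parameter is a fixed value, while each statistic depends on which units happened to be drawn.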
Example 2.1.1: The Liraglutide Effect and Action in Diabetes:
Evaluation of Cardiovascular Outcome Results (LEADER) clinical trial.
The LEADER clinical trial was initiated in 2010 at 410 hospitals in 32 countries
to evaluate the effect of liraglutide, a drug for treatment of type 2 diabetes, on
the frequency of cardiovascular diseases such as heart attack, stroke, and
heart failure (Source). The populations under study were type 2 diabetes
patients with excessively high blood sugar taking either liraglutide or a
placebo (an inactive drug). The parameters measured included blood sugar
level, kidney function, frequency of adverse effects and
complications, and mortality rate. The overall goal of the trial was to determine

