STAT 200  Elementary Statistics for Applications
Comparing Means between Two Independent Groups
Camila Casquilho
The University of British Columbia, Department of Statistics
Adapted notes from: Eugenia Yu, The University of British Columbia, Department of Statistics
Comparing Means between Two Independent Groups
• Objective: to compare the means of two independent populations
• We draw a random sample from each of the two independent populations:
  • $y_{11}, y_{12}, \ldots, y_{1n_1}$ (sample size $n_1$) from a population with mean $\mu_1$ and standard deviation $\sigma_1$
  • $y_{21}, y_{22}, \ldots, y_{2n_2}$ (sample size $n_2$) from a population with mean $\mu_2$ and standard deviation $\sigma_2$
• Two samples are said to be independent if the individuals selected for one sample do not dictate which individuals are to be in the second sample.
• Two samples are said to be dependent or paired when the individuals selected to be in one sample determine the individuals to be included in the second sample.
Distinguishing between independent and dependent samples
• Let’s consider two scenarios:
  1. You want to compare the mean IQ between males and females. To test for a difference, you randomly select 20 females and 20 males. The two sets of IQ scores (one per gender group) are independent of each other.
  2. You want to compare the mean IQ between the older and younger siblings of twin pairs. To test for a difference, you randomly select 20 twin pairs. One sample of IQ scores comes from the older siblings of the 20 twin pairs, and the other sample comes from the younger siblings of the same 20 twin pairs. There is a paired structure between the two samples.
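The difference between the two scenarios shows up in how the data would be stored. The sketch below is only an illustration: the IQ scores are simulated, and the population values used (mean 100, standard deviation 15, and the split into a shared and an individual component for twins) are assumptions made up for the example, not values from these notes.

```python
import random

rng = random.Random(1)  # fixed seed so the simulated data are reproducible

# Scenario 1 (independent): two separately drawn groups. There is no link
# between the i-th female and the i-th male, so the order within each
# list carries no meaning.
females = [rng.gauss(100, 15) for _ in range(20)]
males = [rng.gauss(100, 15) for _ in range(20)]

# Scenario 2 (paired): each record is one twin pair, so the two scores in
# a record belong together.
twin_pairs = []
for _ in range(20):
    shared = rng.gauss(100, 12)        # hypothetical component shared by the pair
    older = shared + rng.gauss(0, 5)   # hypothetical individual variation
    younger = shared + rng.gauss(0, 5)
    twin_pairs.append((older, younger))

# With paired data, within-pair differences are well defined.
pair_differences = [older - younger for older, younger in twin_pairs]

print(len(females), len(males), len(pair_differences))
```

In the first scenario shuffling one list changes nothing, while in the second shuffling one column would destroy the pairing.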
Sampling distribution for difference in means of two independent populations
• To estimate $\mu_1 - \mu_2$, we use $\bar{y}_1 - \bar{y}_2$, where $\bar{y}_1$ and $\bar{y}_2$ are the sample means from the two samples.
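As a small illustration, the point estimate is just the difference of the two sample means. The function name `diff_in_means` and the data values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from statistics import mean

def diff_in_means(sample1, sample2):
    """Point estimate of mu_1 - mu_2: difference of the two sample means."""
    return mean(sample1) - mean(sample2)

# Hypothetical measurements, made up purely for illustration.
group1 = [12.0, 15.0, 11.0, 14.0]   # n_1 = 4, sample mean 13
group2 = [10.0, 9.0, 13.0]          # n_2 = 3; the sample sizes need not match

estimate = diff_in_means(group1, group2)
print(estimate)
```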
• Given that the 2 random samples are independent, the sampling distribution of $\bar{y}_1 - \bar{y}_2$ will have
  • mean: $\mu_1 - \mu_2$ (unbiased)
  • standard deviation: $SD(\bar{y}_1 - \bar{y}_2) = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
  (See Appendix for the mathematical derivation of the expression for the standard deviation.)
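The mean and standard deviation of the sampling distribution can be checked with a short simulation. This is a minimal sketch, not part of the original notes: the population parameters, sample sizes, seed, and the helper name `sd_diff_means` are all assumptions, and Normal populations are used only for convenience.

```python
import math
import random
from statistics import fmean, stdev

def sd_diff_means(sigma1, n1, sigma2, n2):
    """SD of the difference in sample means for two independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Hypothetical population parameters and sample sizes.
mu1, sigma1, n1 = 100.0, 15.0, 20
mu2, sigma2, n2 = 95.0, 10.0, 30

theoretical_sd = sd_diff_means(sigma1, n1, sigma2, n2)

rng = random.Random(200)  # fixed seed so the run is reproducible
diffs = []
for _ in range(10000):
    # Draw the two samples independently of each other.
    sample1 = [rng.gauss(mu1, sigma1) for _ in range(n1)]
    sample2 = [rng.gauss(mu2, sigma2) for _ in range(n2)]
    diffs.append(fmean(sample1) - fmean(sample2))

print("formula SD:   ", theoretical_sd)
print("simulated SD: ", stdev(diffs))
print("simulated mean:", fmean(diffs), "vs mu1 - mu2 =", mu1 - mu2)
```

If the formula holds, the simulated standard deviation should land close to the formula value, and the simulated mean close to the true difference.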
• As long as the two samples are independent (and their sizes $n_1$ and $n_2$ are sufficiently large when the population distributions of $y_1$ and $y_2$ are unknown), the sampling distribution of $\bar{y}_1 - \bar{y}_2$ will follow the Normal model, at least approximately.
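One rough way to see the Normal model at work is to sample from clearly non-Normal populations. If the model fits, about 95% of the simulated differences should fall within 1.96 standard deviations of the true difference. The sketch below assumes exponential populations with means 2 and 1, sample sizes of 50, and a fixed seed, all chosen only for illustration.

```python
import math
import random
from statistics import fmean

rng = random.Random(2024)  # fixed seed so the run is reproducible

# Hypothetical skewed populations: exponential with means 2 and 1.
# For an exponential distribution the SD equals the mean.
mu1, sigma1, n1 = 2.0, 2.0, 50
mu2, sigma2, n2 = 1.0, 1.0, 50

sd_diff = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

reps = 10000
within = 0
for _ in range(reps):
    ybar1 = fmean([rng.expovariate(1 / mu1) for _ in range(n1)])
    ybar2 = fmean([rng.expovariate(1 / mu2) for _ in range(n2)])
    # Standardize the simulated difference using the true mean and SD.
    z = ((ybar1 - ybar2) - (mu1 - mu2)) / sd_diff
    if abs(z) <= 1.96:
        within += 1

fraction_within = within / reps
print(fraction_within)
```

Rerunning with much smaller sample sizes (for example 5 instead of 50) is a simple way to explore how the approximation behaves when the samples are not large.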