Chapter 12
Inference for Two Numerical Populations
12.1 Comparing the Means of Two Populations; Independent Samples
We have two populations. If you want to study them individually, use the methods of Chapter 11.
In this section we learn how to compare the populations, using estimation and hypothesis testing.
Throughout, we assume that we have random samples from the two populations and that the
samples are independent. (Independent samples were discussed in Chapter 9.)
We begin with some notation. The first population has mean $\mu_1$, standard deviation $\sigma_1$ and
variance $\sigma_1^2$. The second population has mean $\mu_2$, standard deviation $\sigma_2$ and variance $\sigma_2^2$.
Of course, the researcher does not know these six numbers, but Nature does.
We begin with the problem of estimation. Our goal is to estimate $\mu_1 - \mu_2$. Our data consist of
independent random samples from the two populations.
Denote the data from the first population by $x_1, x_2, \ldots, x_{n_1}$, and denote the data from the
second population by $y_1, y_2, \ldots, y_{n_2}$.
It is, of course, important to look at the data and think about the purpose of the research. If it
seems reasonable scientifically to compare the two populations by comparing their means, then we
will proceed with the methods introduced in this section.
We summarize our two sets of data by computing their means and standard deviations, which
are denoted by $\bar{X}$, $S_1$, $\bar{Y}$ and $S_2$ when we view them as random variables, with observed values
$\bar{x}$, $s_1$, $\bar{y}$ and $s_2$.
Our point estimate of $\mu_1 - \mu_2$ is $\bar{X} - \bar{Y}$.
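As a small illustration of the summaries just described, here is a minimal Python sketch. The data values are hypothetical, invented only for this example, and I am assuming that the sample standard deviation uses the usual $n - 1$ divisor (which is what `statistics.stdev` computes).

```python
from statistics import mean, stdev

# Hypothetical data, invented purely for illustration.
x = [12.1, 9.8, 11.4, 10.7, 13.0, 10.2]  # sample from population 1, n1 = 6
y = [9.5, 8.9, 10.8, 9.1, 10.3]          # sample from population 2, n2 = 5

# Observed summaries; stdev uses the n - 1 divisor (an assumption here).
x_bar, s1 = mean(x), stdev(x)
y_bar, s2 = mean(y), stdev(y)

# Observed value of the point estimate of mu_1 - mu_2.
point_estimate = x_bar - y_bar

print(f"x_bar = {x_bar:.3f}, s1 = {s1:.3f}")
print(f"y_bar = {y_bar:.3f}, s2 = {s2:.3f}")
print(f"point estimate of mu_1 - mu_2 = {point_estimate:.3f}")
```

The sketch only computes observed values; nothing in it tells us how close the point estimate is to $\mu_1 - \mu_2$.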
There is a CLT for this problem too. First, it shows us how to standardize our estimator:
$$W = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{(\sigma_1^2/n_1) + (\sigma_2^2/n_2)}}. \tag{12.1}$$
Second, it states that we can approximate probabilities for $W$ by using the snc and that, in the limit
as both sample sizes become larger and larger, the approximations are accurate.
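One way to get a feel for this approximation is a small simulation. The sketch below is only an illustration under assumptions of my own choosing: two exponential populations with means 2 and 5 (so their standard deviations are also 2 and 5), sample sizes 50 and 60, and 2000 replications. It computes $W$ from (12.1) for each pair of samples and records how often $W$ falls between $-1.96$ and $1.96$; the area under the standard normal curve (snc) between those values is about 0.95.

```python
import math
import random


def standardized_w(x, y, mu1, mu2, sigma1, sigma2):
    """Observed value of W in (12.1); needs the true population parameters."""
    n1, n2 = len(x), len(y)
    x_bar = sum(x) / n1
    y_bar = sum(y) / n2
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return ((x_bar - y_bar) - (mu1 - mu2)) / se


def simulate(n1=50, n2=60, reps=2000, seed=1):
    """Proportion of simulated W values that land in [-1.96, 1.96]."""
    rng = random.Random(seed)
    # Assumed populations: exponential, for which the sd equals the mean.
    mu1, mu2 = 2.0, 5.0
    inside = 0
    for _ in range(reps):
        x = [rng.expovariate(1 / mu1) for _ in range(n1)]
        y = [rng.expovariate(1 / mu2) for _ in range(n2)]
        w = standardized_w(x, y, mu1, mu2, sigma1=mu1, sigma2=mu2)
        if abs(w) <= 1.96:
            inside += 1
    return inside / reps


proportion = simulate()
print(f"proportion of W in [-1.96, 1.96]: {proportion:.3f}")
```

Note that the simulation can compute $W$ only because it plays the role of Nature and knows the population means and standard deviations; a researcher with real data cannot.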
Two issues remain before we can use $W$ for inference.
First, we need to eliminate the unknown parameters in the denominator of $W$. Because there
are now two unknown parameters where in Chapter 11 there was one, this will require additional
care. Second, we will need to decide what to use for our reference curve: the snc of the CLT (and
Slutsky) or the $t$ curves of Gosset.
When all the smoke has cleared, statisticians suggest three methods, referred to in my text as
Cases 1, 2 and 3. I personally think that Case 2 is scientifically worthless, so we won’t cover it.
(It is mathematically interesting, which is, in my opinion, why books feature it. Me, I put it in my
book because I did not want to automatically lose the ‘Case 2 market.’)
We will begin with Case 3; I will follow the popular terminology and call this the large sample
approximation method.
12.1.1 Case 3: The Large Sample Approximation
Case 3 makes a lot of sense to the new student of Statistics: simply replace the population variances
by their corresponding sample variances. This changes our earlier $W$ to $W_3$. (The 3 is for Case 3.)
$$W_3 = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{(S_1^2/n_1) + (S_2^2/n_2)}}. \tag{12.2}$$
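To make the substitution in (12.2) concrete, here is a minimal Python sketch. The function names `estimated_se` and `w3` are hypothetical labels of mine, and I am again assuming sample variances with the $n - 1$ divisor (what `statistics.variance` computes). Because $\mu_1 - \mu_2$ is unknown, the observed value of $W_3$ can be computed only for a stated candidate value of that difference, which the sketch takes as the argument `diff`.

```python
import math
from statistics import mean, variance


def estimated_se(x, y):
    """Denominator of (12.2): sample variances replace population variances."""
    return math.sqrt(variance(x) / len(x) + variance(y) / len(y))


def w3(x, y, diff):
    """Observed value of W_3 for a candidate value `diff` of mu_1 - mu_2."""
    return ((mean(x) - mean(y)) - diff) / estimated_se(x, y)


# Hypothetical data, invented purely for illustration.
x = [1, 2, 3]
y = [2, 4, 6]

print(f"estimated standard error: {estimated_se(x, y):.4f}")
print(f"W_3 if mu_1 - mu_2 were 0: {w3(x, y, 0.0):.4f}")
```

Unlike the denominator of (12.1), everything in `estimated_se` can be computed from the data alone.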
Case 3 states that we should use the snc as our reference curve. This leads to the following formula
for the CI for $\mu_1 - \mu_2$:
$(\bar{x} - \dots$
 Fall '11
 hanlon
 Variance, researcher, independent random samples
