Chapter 4
The Two Sample Location Problem
With Independent Samples
4.1 Wilcoxon Rank Sum Test
This test was first published by Wilcoxon in 1945, which is why some authors refer to it as the Wilcoxon Rank
Sum Test (the name reflects that it is based on a sum of ranks). In 1947, Mann and Whitney published an equivalent test in
more detail, so other authors refer to it as the Wilcoxon-Mann-Whitney Rank Sum Test.
Data:
We have a random sample of m observations from population 1, denoted by X_1, ..., X_m, and another independent
random sample of n observations from population 2, denoted by Y_1, ..., Y_n.
The total number of observations is N = m + n. Without loss of generality we take n < m.
Assumptions:
o A1: The observations X_1, ..., X_m are a random sample from population 1, so they are independent of one
another and have the same distribution. The other sample, Y_1, ..., Y_n, is a random sample from
population 2, so these observations are also independent of one another and have the same distribution.
o A2: The two samples are mutually independent.
o A3: Population 1 and population 2 are both continuous populations.
o A4: Y_j has the same distribution as X_i + ∆ for all i = 1, ..., m and j = 1, ..., n.
o The parameter ∆ is the treatment effect.
An Interpretation of the Parameter ∆ in the Two Sample Location Problem
• The parameter ∆ denotes the amount of shift, i.e. the separation between the two populations. [Assumption A4
means the two population distributions have the same shape and spread; they differ only in location.]
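The effect of a pure location shift can be illustrated with a minimal Python sketch. The data values, the value of delta, and the helper name shift_sample are hypothetical and chosen only for illustration. Adding delta to every value is a sample-level stand-in for the distributional statement in A4, not a statement about any particular population.

```python
import statistics


def shift_sample(values, delta):
    """Return a copy of `values` with every observation moved by `delta`."""
    return [v + delta for v in values]


# Hypothetical X-values and a hypothetical shift (treatment effect).
x = [1.0, 2.5, 4.0, 7.5, 3.0]
delta = 2.0

# Values distributed like X + delta, as in assumption A4.
y_like = shift_sample(x, delta)

# The location (here the median) moves by delta ...
location_change = statistics.median(y_like) - statistics.median(x)

# ... while the spread (here the range) is unchanged.
spread_x = max(x) - min(x)
spread_y = max(y_like) - min(y_like)

print(location_change, spread_x, spread_y)
```

In this sketch the median moves by delta while the range stays the same, which is the sense in which the two populations differ only in location.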
The tables in the text assume that
o The sample size from population 2, n, is smaller than the sample size from population 1, m; that is, n < m.
o In other words, "Population 2" is the one from which the smaller sample is selected.
Procedure:
1. Order the combined sample of N = m + n X- and Y-values from smallest to largest.
2. Let S_1 denote the rank of Y_1, S_2 denote the rank of Y_2, ..., and S_n denote the rank of Y_n in
this joint ordering.
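The two steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name joint_ranks_of_y and the data values are hypothetical, and the sketch assumes there are no ties, which is consistent with the continuity assumption A3.

```python
def joint_ranks_of_y(x, y):
    """Return [S_1, ..., S_n], the rank of each Y_j in the combined ordering.

    Assumes no tied values (continuous populations, assumption A3).
    """
    combined = list(x) + list(y)
    if len(set(combined)) != len(combined):
        raise ValueError("tied values found; this sketch assumes no ties")

    # Step 1: order the N = m + n combined values from smallest to largest.
    ordered = sorted(combined)

    # Rank 1 goes to the smallest value, rank N to the largest.
    rank_of = {value: rank for rank, value in enumerate(ordered, start=1)}

    # Step 2: S_j is the rank of Y_j in the joint ordering.
    return [rank_of[value] for value in y]


# Hypothetical data: m = 4 observations from population 1, n = 3 from population 2.
x = [3.1, 5.4, 2.7, 6.0]
y = [4.2, 1.9, 5.0]

s = joint_ranks_of_y(x, y)
print(s)  # [4, 1, 5]
```

Here the joint ordering is 1.9, 2.7, 3.1, 4.2, 5.0, 5.4, 6.0, so Y_1 = 4.2 has rank 4, Y_2 = 1.9 has rank 1, and Y_3 = 5.0 has rank 5.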