a bias in coin tossing in favor of the side facing up at the start of the toss. The biased proportion
was estimated to be 50.8% in place of the commonly expected 50%. In 2009 two Berkeley students
each flipped a coin 20,000 times, one starting with Heads facing up, the other with Tails facing up.
See ~aldous/RealWorld/coin_tosses.html for details.
The student starting with Heads up obtained 10231 Heads, and the student starting with Tails up
obtained 10014 Tails. This leads to the following estimates, given separately and combined:
> prop.test(x = 10014, n = 20000)
1-sample proportions test with continuity correction
data:  10014 out of 20000, null probability 0.5
X-squared = 0.0365, df = 1, p-value = 0.8486
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.4937460 0.5076537
sample estimates:
p
0.5007
> prop.test(x = 10231, n = 20000)
1-sample proportions test with continuity correction
data:  10231 out of 20000, null probability 0.5
X-squared = 10.626, df = 1, p-value = 0.001115
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.5045958 0.5184998
sample estimates:
p
0.51155
> prop.test(x = 10014 + 10231, n = 40000)
1-sample proportions test with continuity correction
data:  10014 + 10231 out of 40000, null probability 0.5
X-squared = 5.978, df = 1, p-value = 0.01449
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.5012126 0.5110362
sample estimates:
p
0.506125
The pooled estimate of the proportion is $\hat{p} = 50.6\%$, rather close to the value predicted in [3]. In
addition, the hypothesis $H_0: p = 0.5$ is rejected at the 5% significance level, with p-value $P = 0.01449$.
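As a rough check on the pooled output, the continuity-corrected statistic can be recomputed by hand. The sketch below assumes the usual Yates-corrected form of the one-sample statistic, $(|x - np_0| - 0.5)^2 / (np_0(1 - p_0))$, referred to a chi-squared distribution with 1 degree of freedom. The variable names are illustrative and not taken from the text.

```r
# Pooled counts reported in the text
x  <- 10014 + 10231   # tosses that landed on the side facing up at the start
n  <- 40000           # total number of tosses
p0 <- 0.5             # proportion under the null hypothesis

# Point estimate of the proportion
p_hat <- x / n

# Continuity-corrected chi-squared statistic (assumed Yates form)
x_squared <- (abs(x - n * p0) - 0.5)^2 / (n * p0 * (1 - p0))

# Two-sided p-value: upper tail of the chi-squared distribution, 1 df
p_value <- pchisq(x_squared, df = 1, lower.tail = FALSE)

# The same quantities as computed by prop.test, for comparison
fit <- prop.test(x = x, n = n, p = p0)

print(c(p_hat = p_hat, x_squared = x_squared, p_value = p_value))
print(c(fit$estimate, fit$statistic, p.value = fit$p.value))
```

Both printed rows should agree with the values shown in the pooled output above.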
What sample size is needed to obtain a margin of error of 0.5% using a 95% confidence interval?
Methods for solving this type of problem will be discussed in Section 16.2.
∎
15.6 Assumptions
The use of the normal distribution in the probability calculations is justified by the central limit
theorem, although a large sample is needed for this purpose.
To adapt the rule of thumb given previously for the normal approximation to the binomial, we will
require that $n\hat{p} \geq 5$ and $n(1 - \hat{p}) \geq 5$, where $n$ is the sample size.
For the two-sample case we simply apply these conditions separately to each of the two samples.
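The conditions can be checked directly from the observed counts. The following is a minimal sketch; `large_sample_ok` is a hypothetical helper written for illustration, not a function from the text or from base R. It is vectorized, so the two-sample case is handled by passing both samples at once.

```r
# Hypothetical helper (not from the text): TRUE when both
# n * p_hat >= threshold and n * (1 - p_hat) >= threshold hold.
large_sample_ok <- function(x, n, threshold = 5) {
  p_hat <- x / n
  (n * p_hat >= threshold) & (n * (1 - p_hat) >= threshold)
}

# The two coin-tossing samples, checked separately
coin_ok <- large_sample_ok(x = c(10014, 10231), n = c(20000, 20000))
print(coin_ok)

# An invented small sample with only 2 successes in 30 trials
small_ok <- large_sample_ok(x = 2, n = 30)
print(small_ok)
```

The coin-tossing samples satisfy the conditions by a wide margin, while the invented small sample does not, since $n\hat{p} = 2 < 5$.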
Chapter 16
Sample Size Estimation for Confidence Intervals
Sampling design is a crucial part of any study involving the collection of data. The objectives of the
study should be defined in advance, and the sampling scheme designed specifically to meet those
objectives. Cost might be an important feature of the design. We may wish to avoid doing more
sampling than is necessary to achieve the objective.
An important example of sampling design is the determination of an appropriate sample size
for constructing a confidence interval. We next consider this problem for two kinds of confidence
intervals previously studied.
16.1 General Approach to Sample Size Calculations: Normal Approximations
Recall that the confidence interval for a population mean, given population variance $\sigma^2$, is
$$CI_{1-\alpha} = \bar{X}_n \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}.$$
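The interval recalled above can be evaluated numerically as follows. This is a sketch only: `z_interval` is a hypothetical helper, and the inputs (sample mean 10, $\sigma = 2$, $n = 100$) are invented for illustration rather than taken from the text.

```r
# Hypothetical helper (not from the text): z-interval for a population
# mean when the population standard deviation sigma is known.
z_interval <- function(xbar, sigma, n, conf_level = 0.95) {
  alpha  <- 1 - conf_level
  z      <- qnorm(1 - alpha / 2)     # upper alpha/2 quantile of N(0, 1)
  margin <- z * sigma / sqrt(n)      # margin of error
  c(lower = xbar - margin, upper = xbar + margin, margin = margin)
}

# Invented inputs, for illustration only
ci <- z_interval(xbar = 10, sigma = 2, n = 100)
print(ci)
```

The margin of error $z_{\alpha/2}\,\sigma/\sqrt{n}$ shrinks in proportion to $1/\sqrt{n}$, which is the relationship that sample size calculations exploit.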