Note that the answers are slightly different because, in R, the function's documentation states that "this method does not use the concept of adding 2 successes and 2 failures, but rather uses the formulas explicitly described in [the paper]". Hence we recommend and encourage the use of software. However, the software does not compute one-sided intervals, so those have to be done manually.
Example 3.8 A map and GPS application for a smartphone was tested for accuracy. The experiment yielded 26 errors out of the 74 trials. Find the 90% C.I. for the proportion of errors.
Since $n = 74$ and $x = 26$, then $\tilde{n} = 74 + 4$ and $\tilde{p} = (26 + 2)/78 = 0.359$. Hence the 90% C.I. for $p$ is
\[
0.359 \mp \underbrace{z_{1-0.05}}_{1.645} \sqrt{\frac{0.359(1 - 0.359)}{78}} \;\rightarrow\; (0.2696337,\ 0.4483151)
\]
or in R:
> library(binom) #may need to first install package
> binom.confint(26,74,0.90,methods="ac")
         method  x  n      mean     lower     upper
1 agresti-coull 26 74 0.3513514 0.2666357 0.4465532
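Since the text notes that one-sided intervals have to be done manually, the following is a minimal base-R sketch of the "add 2 successes and 2 failures" computation for Example 3.8. The variable names are illustrative, and the one-sided bounds assume the usual convention of placing all of $\alpha$ in one tail; it is not the `binom` package's own implementation.

```r
# Agresti-Coull interval "by hand": add 2 successes and 2 failures.
x <- 26; n <- 74; conf <- 0.90
alpha <- 1 - conf
n_tilde <- n + 4
p_tilde <- (x + 2) / n_tilde
se <- sqrt(p_tilde * (1 - p_tilde) / n_tilde)

# Two-sided interval: uses the 1 - alpha/2 normal quantile.
two_sided <- p_tilde + c(-1, 1) * qnorm(1 - alpha / 2) * se

# One-sided bounds (assumed convention): all of alpha in one tail,
# so the 1 - alpha quantile is used instead.
upper_bound <- p_tilde + qnorm(1 - alpha) * se  # interval (0, upper_bound)
lower_bound <- p_tilde - qnorm(1 - alpha) * se  # interval (lower_bound, 1)

print(two_sided)
print(c(lower_bound = lower_bound, upper_bound = upper_bound))
```

The two-sided result should agree with the hand computation in the example, while each one-sided bound lies inside the two-sided interval because it uses a smaller quantile.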
3.2.2 Large sample hypothesis test
Let $X$ be the number of successes in $n$ Bernoulli trials with probability of success $p$, then $X \sim \text{Bin}(n, p)$. We know by the C.L.T. that, under certain regularity conditions,
\[
\hat{p} \sim N\left(p,\ \frac{p(1 - p)}{n}\right).
\]
To test
(i) $H_0: p \le p_0$ vs $H_a: p > p_0$
(ii) $H_0: p \ge p_0$ vs $H_a: p < p_0$
(iii) $H_0: p = p_0$ vs $H_a: p \ne p_0$
The test statistic equivalent to the Agresti-Coull method is
\[
T.S. = \frac{\tilde{p} - p_0}{\sqrt{\dfrac{\tilde{p}(1 - \tilde{p})}{\tilde{n}}}} \overset{H_0}{\sim} N(0, 1)
\]
Reject the null if
(i) p-value $= P(Z \ge T.S.) < \alpha$
(ii) p-value $= P(Z \le T.S.) < \alpha$
(iii) p-value $= P(|Z| \ge |T.S.|) < \alpha$
Example 3.9 In Example 3.8, if we wished to test whether errors occur less than half the time, then $H_a: p < 0.5$.
\[
T.S. = \frac{28/78 - 0.5}{\sqrt{\dfrac{(28/78)(1 - 28/78)}{78}}} = -2.596426
\]
with p-value $= 0.00470996 < \alpha = 0.10$, so we reject the null. This agrees with the previous C.I., whose upper limit, 0.4483, is less than 0.5.
R code 3.3 Multiple methods and (software) functions exist for performing such inference, such as prop.test, but this function does not perform the Agresti-Coull method.
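Because no ready-made function is named for this test, here is a minimal base-R sketch of the large-sample test using the Agresti-Coull adjusted estimate. The helper `ac_test` and its argument names are hypothetical (not part of any package); the three branches mirror rejection rules (i) to (iii).

```r
# Hypothetical helper (not from any package): large-sample test for a
# proportion based on the Agresti-Coull adjusted estimate.
ac_test <- function(x, n, p0,
                    alternative = c("greater", "less", "two.sided")) {
  alternative <- match.arg(alternative)
  n_tilde <- n + 4
  p_tilde <- (x + 2) / n_tilde
  ts <- (p_tilde - p0) / sqrt(p_tilde * (1 - p_tilde) / n_tilde)
  p_value <- switch(alternative,
    greater   = pnorm(ts, lower.tail = FALSE),  # (i)   P(Z >= T.S.)
    less      = pnorm(ts),                      # (ii)  P(Z <= T.S.)
    two.sided = 2 * pnorm(-abs(ts))             # (iii) P(|Z| >= |T.S.|)
  )
  list(statistic = ts, p.value = p_value)
}

# Example 3.9: 26 errors in 74 trials, H_a: p < 0.5.
res <- ac_test(26, 74, p0 = 0.5, alternative = "less")
print(res)
```

Applied to the data of Example 3.9, the statistic and p-value should match the values computed by hand there.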
3.3 Inference for Population Variance
The sample statistic $s^2$ is widely used as the point estimate for the population variance $\sigma^2$, and similar to the sample mean it varies from sample to sample and has a sampling distribution.
Let $X_1, \ldots, X_n$ be i.i.d. r.v.'s. We already have some tools that help us determine the distribution of $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, a function of the r.v.'s, and hence $\bar{X}$ is a r.v. itself and once a sample is collected a realization $\bar{X} = \bar{x}$ is observed. Similarly, let
\[
S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2
\]
be a function of the r.v.'s $X_1, \ldots, X_n$ and hence is a r.v. itself. A realization of this r.v. is the sample variance $s^2$.
From Lemma 2.9, if $X_1, \ldots, X_n$ are i.i.d. $N(\mu, \sigma^2)$ then
\[
\frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1}.
\]
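As an informal check of this distributional result, the following simulation sketch draws repeated normal samples and compares $(n-1)S^2/\sigma^2$ with the $\chi^2_{n-1}$ distribution. The sample size, parameters, seed, and number of replications are arbitrary illustrative choices, not values from the text. Note that R's `rnorm` takes the standard deviation, not the variance.

```r
# Simulation sketch: (n - 1) S^2 / sigma^2 for normal samples.
set.seed(1)  # arbitrary seed, for reproducibility only
n <- 10; mu <- 5; sigma <- 2; reps <- 20000  # illustrative values
stat <- replicate(
  reps,
  (n - 1) * var(rnorm(n, mean = mu, sd = sigma)) / sigma^2
)

# A chi-square r.v. with n - 1 df has mean n - 1 and variance 2(n - 1).
print(c(mean = mean(stat), var = var(stat)))

# Compare a few empirical quantiles with the chi-square quantiles.
probs <- c(0.05, 0.50, 0.95)
print(rbind(
  empirical   = quantile(stat, probs),
  theoretical = qchisq(probs, df = n - 1)
))
```

The simulated mean and variance should be close to $n - 1$ and $2(n - 1)$ respectively, and the empirical quantiles close to the theoretical ones.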
3.3.1 Confidence interval
Figure 3.4: $\chi^2$ distribution and critical value.
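Consequently,
\[
\begin{aligned}
1 - \alpha &= P\left( \chi^2_{(\alpha/2;\, n-1)} < \frac{(n - 1) S^2}{\sigma^2} < \chi^2_{(1-\alpha/2;\, n-1)} \right) \\
&= P\left( \frac{(n - 1) S^2}{\chi^2_{(1-\alpha/2;\, n-1)}} < \sigma^2 < \frac{(n - 1) S^2}{\chi^2_{(\alpha/2;\, n-1)}} \right)
\end{aligned}
\]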
which implies that in the long run this interval will contain the true population variance parameter $100(1 - \alpha)\%$ of the time. Thus, the $100(1 - \alpha)\%$ C.I. for $\sigma^2$ is
\[
\left( \frac{(n - 1) s^2}{\chi^2_{(1-\alpha/2;\, n-1)}},\ \frac{(n - 1) s^2}{\chi^2_{(\alpha/2;\, n-1)}} \right).
\]
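The interval above can be sketched in base R as follows. The helper name `var_ci` and the data vector are made up for illustration, and the sketch assumes that $\chi^2_{(q;\, n-1)}$ denotes the lower-tail quantile (what `qchisq(q, df = n - 1)` returns), which is the reading under which the inequalities above hold.

```r
# Hypothetical helper: chi-square C.I. for a normal population variance.
# Assumes chi^2_(q; n-1) is the lower-tail quantile, i.e. qchisq(q, n - 1).
var_ci <- function(x, conf = 0.95) {
  n <- length(x)
  s2 <- var(x)  # sample variance with divisor n - 1
  alpha <- 1 - conf
  lower <- (n - 1) * s2 / qchisq(1 - alpha / 2, df = n - 1)
  upper <- (n - 1) * s2 / qchisq(alpha / 2, df = n - 1)
  c(lower = lower, s2 = s2, upper = upper)
}

# Made-up measurements, for illustration only.
y <- c(4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.5, 4.9)
ci <- var_ci(y, conf = 0.90)
print(ci)
```

The interval is not symmetric about $s^2$ because the $\chi^2$ distribution is skewed, and it relies on the normality assumption of Lemma 2.9.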