Statistical Inference for FE
Professor S. Kou, Department of IEOR, Columbia University
Lecture 4. Goodness-of-Fit Tests
In this lecture we shall study various goodness-of-fit tests, whose objective is to test whether a model fits data. We shall introduce two general chi-square tests for discrete observations, and various special tests for hypothesis testing on distributions.
1 Chi-Square Tests
To understand the motivation of the chi-square test we first consider a simple example.
Example 1. The following table presents the number of trades (buys and sells of stocks) conducted within a one-week period from 48 individual brokerage accounts.
# of trades    0    1    2    3   ≥4
Frequency      9    9   10   14    6
For example, there are 10 accounts which had made 2 trades during the week. We want to test whether a Poisson distribution can fit the data.
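As a quick illustration (not part of the lecture notes), the rate parameter λ of the hypothesized Poisson model can be estimated from the grouped data. The open-ended "≥ 4" cell is a complication; a simple, rough approximation is to treat those 6 accounts as having exactly 4 trades each:

```python
# A minimal sketch (assumption: the ">= 4" cell is treated as exactly 4
# trades, since the individual counts within that cell are not reported).
counts = [9, 9, 10, 14, 6]     # frequencies for 0, 1, 2, 3, >= 4 trades
n = sum(counts)                # total number of accounts
lam_hat = sum(k * c for k, c in zip(range(5), counts)) / n
print(n, lam_hat)              # 48 accounts, lam_hat = 95/48
```

The exact MLE for grouped data with a censored last cell would differ slightly; this grouped sample mean is only a convenient stand-in.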
In general, suppose the data has been grouped into m categories (or cells); in Example 1, we have 5 cells. We want to test
\[
H_0: p = p(\theta) \in \omega_0, \qquad H_a: p \neq p(\theta),\ p \in \Omega,
\]
where θ is a parameter in a given probability distribution, and
\[
\Omega = \Big\{ p : \sum_{i=1}^{m} p_i = 1,\ p_i \ge 0 \Big\}.
\]
For example, in Example 1, p(θ) represents probabilities according to a Poisson distribution with an unknown parameter (which is λ in the Poisson distribution).
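For Example 1 the cell probabilities p(θ) can be written out explicitly. A short sketch, with the λ value chosen purely for illustration:

```python
import math

def poisson_cell_probs(lam):
    """Cell probabilities for the 5 cells {0, 1, 2, 3, >= 4} under a
    Poisson(lam) model; the last cell is the tail probability, so the
    five probabilities sum to 1."""
    p = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(4)]
    p.append(1.0 - sum(p))     # P(X >= 4)
    return p

probs = poisson_cell_probs(2.0)    # lam = 2.0 is an arbitrary illustration
print([round(q, 4) for q in probs])
```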
1.1 Likelihood Ratio Chi-Square Test
We shall use the likelihood ratio statistic, which is given by
\[
\Lambda = \frac{\max_{H_0} \operatorname{lik}(p(\theta))}{\max_{H_a} \operatorname{lik}(p)}
        = \frac{\max_{p \in \omega_0} \operatorname{lik}(p(\theta))}{\max_{p \in \Omega} \operatorname{lik}(p)},
\]
where lik denotes the likelihood. Since the likelihood for the m cells is given by
\[
\frac{n!}{X_1! \cdots X_m!}\, p_1^{X_1} p_2^{X_2} \cdots p_m^{X_m},
\]
we have
\[
\Lambda = \frac{\max_{\theta \in \omega_0} \frac{n!}{X_1! \cdots X_m!}\, p_1(\theta)^{X_1} p_2(\theta)^{X_2} \cdots p_m(\theta)^{X_m}}{\max_{p \in \Omega} \frac{n!}{X_1! \cdots X_m!}\, p_1^{X_1} p_2^{X_2} \cdots p_m^{X_m}}.
\]
The denominator is maximized at
\[
\hat{p}_i = \frac{X_i}{n}.
\]
For the numerator, let
\(\hat{\theta}\) be the maximizer of the numerator. Thus, the likelihood ratio test statistic is
\[
\Lambda = \frac{p_1(\hat{\theta})^{X_1} p_2(\hat{\theta})^{X_2} \cdots p_m(\hat{\theta})^{X_m}}{\hat{p}_1^{X_1} \hat{p}_2^{X_2} \cdots \hat{p}_m^{X_m}}
        = \prod_{i=1}^{m} \left( \frac{p_i(\hat{\theta})}{\hat{p}_i} \right)^{X_i},
\]
and
\[
-2 \log \Lambda = -2 \sum_{i=1}^{m} X_i \log \left( \frac{p_i(\hat{\theta})}{\hat{p}_i} \right).
\]
Denote by O_i = X_i the observed cell counts, and by E_i = n p_i(θ̂) the expected cell counts under H_0. Then we know from the general theory of the likelihood ratio test that, under H_0,
\[
-2 \log \Lambda = 2 \sum_{i=1}^{m} O_i \log \left( \frac{O_i}{E_i} \right)
\]
approximately has a χ² distribution with
\[
\text{d.f.} = \dim(\Omega) - \dim(H_0) = m - 1 - \dim(H_0).
\]
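Putting the pieces together for Example 1, here is a sketch of the likelihood ratio statistic. As an assumption not made in the notes, λ is estimated by the grouped sample mean with the "≥ 4" cell treated as exactly 4 trades, a rough stand-in for the true MLE:

```python
import math

O = [9, 9, 10, 14, 6]                # observed cell counts (Example 1)
n = sum(O)
lam = sum(k * c for k, c in zip(range(5), O)) / n   # rough lambda estimate
p = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(4)]
p.append(1.0 - sum(p))               # Poisson tail P(X >= 4)
E = [n * q for q in p]               # expected cell counts under H0
lr_stat = 2.0 * sum(o * math.log(o / e) for o, e in zip(O, E))
df = len(O) - 1 - 1                  # m - 1 - dim(H0); dim(H0) = 1 (lambda)
print(round(lr_stat, 3), df)
```

Here m = 5 cells and one estimated parameter give d.f. = 3.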
1.2 Pearson’s χ² Test
To simplify the likelihood ratio test, note the Taylor series
\[
x \log\left(\frac{x}{x_0}\right) = (x - x_0) + \frac{1}{2} \frac{(x - x_0)^2}{x_0} + \cdots.
\]
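A quick numerical check of this expansion (the values of x₀ and x are arbitrary choices for illustration):

```python
import math

# The two-term Taylor expansion matches x*log(x/x0) up to a
# third-order remainder in (x - x0).
x0 = 10.0
errors = []
for x in (10.5, 10.1, 10.01):
    exact = x * math.log(x / x0)
    approx = (x - x0) + 0.5 * (x - x0) ** 2 / x0
    errors.append(abs(exact - approx))
print(errors)  # the error shrinks rapidly as x approaches x0
```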
Therefore, we also have
\[
-2 \log \Lambda = 2 \sum_{i=1}^{m} O_i \log\left( \frac{O_i}{E_i} \right)
= 2 \sum_{i=1}^{m} \left[ (O_i - E_i) + \frac{1}{2} \frac{(O_i - E_i)^2}{E_i} + \cdots \right]
\approx 2 \sum_{i=1}^{m} \left[ (O_i - E_i) + \frac{1}{2} \frac{(O_i - E_i)^2}{E_i} \right]
= \sum_{i=1}^{m} \frac{(O_i - E_i)^2}{E_i},
\]
where the last equality holds because \(\sum_{i=1}^{m} (O_i - E_i) = n - n = 0\).
This is called Pearson’s χ² test. The likelihood ratio χ² test is approximately equivalent to Pearson’s χ² test. In particular, Pearson’s χ² statistic approximately has a χ² distribution with
\[
\text{d.f.} = m - 1 - \dim(H_0).
\]
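As a sketch, both statistics can be computed for Example 1, using the same rough estimate of λ as above (the grouped sample mean with the "≥ 4" cell treated as 4, an assumption not made in the notes):

```python
import math

O = [9, 9, 10, 14, 6]                # observed cell counts (Example 1)
n = sum(O)
lam = sum(k * c for k, c in zip(range(5), O)) / n   # rough lambda estimate
p = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(4)]
p.append(1.0 - sum(p))               # Poisson tail P(X >= 4)
E = [n * q for q in p]               # expected counts under H0
pearson = sum((o - e) ** 2 / e for o, e in zip(O, E))
lr = 2.0 * sum(o * math.log(o / e) for o, e in zip(O, E))
print(round(pearson, 3), round(lr, 3))   # the two statistics are close
```

With d.f. = 5 − 1 − 1 = 3, the 5% critical value of χ² is about 7.81; under this rough estimate of λ neither statistic exceeds it, although the exact MLE for the grouped data would change the numbers slightly.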
Remarks: (i) In deriving the likelihood ratio χ² test and Pearson’s χ² test we assume that the unknown parameter θ is estimated by the MLE.