NTHU MATH 2820, 2008, Lecture Notes
Ch8, p.35
• The Fisher information of a single observation, say X_1, is
\[
I(\theta)
= E\!\left[\left(\frac{X_1-\theta}{\theta(1-\theta)}\right)^{2}\right]
= \frac{E[(X_1-\theta)^2]}{\theta^2(1-\theta)^2}
= \frac{\mathrm{Var}(X_1)}{\theta^2(1-\theta)^2}
= \frac{\theta(1-\theta)}{\theta^2(1-\theta)^2}
= \frac{1}{\theta(1-\theta)}.
\]
Equivalently, using the second derivative of the log likelihood,
\[
I(\theta)
= -E\!\left[-\frac{X_1}{\theta^2}-\frac{1-X_1}{(1-\theta)^2}\right]
= \frac{\theta}{\theta^2}+\frac{1-\theta}{(1-\theta)^2}
= \frac{1}{\theta}+\frac{1}{1-\theta}
= \frac{1}{\theta(1-\theta)}.
\]
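A quick numerical sketch (not from the notes): for a Bernoulli(θ) observation, the expectation defining I(θ) is a sum over the two outcomes x ∈ {0, 1}, so the closed form 1/(θ(1−θ)) can be checked directly.

```python
# Sketch: compute E[((X - theta)/(theta*(1-theta)))^2] for X ~ Bernoulli(theta)
# by summing over x in {0, 1}, and compare with 1/(theta*(1-theta)).

def fisher_info_bernoulli(theta):
    """Fisher information of one Bernoulli(theta) observation, by direct expectation."""
    total = 0.0
    for x, p in [(0, 1 - theta), (1, theta)]:
        score = (x - theta) / (theta * (1 - theta))  # d/dtheta of log f(x | theta)
        total += score ** 2 * p
    return total

for theta in [0.1, 0.3, 0.5, 0.9]:
    exact = 1 / (theta * (1 - theta))
    assert abs(fisher_info_bernoulli(theta) - exact) < 1e-12
```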
• The Fisher information of observations X_1, ..., X_n is
\[
I_{X_1,\cdots,X_n}(\theta) = nI(\theta) = \frac{n}{\theta(1-\theta)}.
\]
Notice that I_{X_1,\cdots,X_n}(\theta)
— increases when n increases,
— increases when θ ↓ 0 or θ ↑ 1,
— reaches a minimum 4n at θ = 0.5.
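A small check of the last point (a sketch, not from the notes): evaluating n/(θ(1−θ)) on a grid confirms the minimum sits at θ = 0.5 with value 4n.

```python
# Sketch: I_{X_1,...,X_n}(theta) = n/(theta*(1-theta)) is smallest
# at theta = 0.5, where it equals 4n.

def fisher_info_sample(n, theta):
    return n / (theta * (1 - theta))

n = 10
grid = [i / 1000 for i in range(1, 1000)]      # theta in (0, 1)
values = [fisher_info_sample(n, t) for t in grid]
t_min = grid[values.index(min(values))]        # grid point minimizing I

assert abs(t_min - 0.5) < 1e-9
assert abs(min(values) - 4 * n) < 1e-9
```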
made by ShaoWei Cheng (NTHU, Taiwan)
Ch8, p.36
• Consider a single observation Y ∼ Binomial(n, θ). The pmf of Y is
\[
f(y \mid \theta) = \binom{n}{y}\theta^{y}(1-\theta)^{n-y},
\quad \text{for } y \in \{0, 1, \ldots, n\}.
\]
— The second derivative of the log likelihood is
\[
\partial^2 \log f(y \mid \theta)/\partial\theta^2
= -y/\theta^2 - (n-y)/(1-\theta)^2.
\]
— The Fisher information of Y is
\[
I_Y(\theta)
= -E\!\left[-\frac{Y}{\theta^2}-\frac{n-Y}{(1-\theta)^2}\right]
= \frac{n\theta}{\theta^2}+\frac{n-n\theta}{(1-\theta)^2}
= \frac{n}{\theta(1-\theta)}.
\]
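The expectation above can also be checked numerically (a sketch, not from the notes): average −∂² log f(y|θ)/∂θ² over the Binomial(n, θ) pmf and compare with n/(θ(1−θ)).

```python
from math import comb

# Sketch: verify I_Y(theta) = -E[d^2/dtheta^2 log f(Y|theta)] = n/(theta*(1-theta))
# by summing y/theta^2 + (n-y)/(1-theta)^2 against the Binomial(n, theta) pmf.

def fisher_info_binomial(n, theta):
    total = 0.0
    for y in range(n + 1):
        pmf = comb(n, y) * theta**y * (1 - theta)**(n - y)
        total += pmf * (y / theta**2 + (n - y) / (1 - theta)**2)
    return total

n, theta = 20, 0.3
assert abs(fisher_info_binomial(n, theta) - n / (theta * (1 - theta))) < 1e-9
```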
— Note that I_Y(θ) is the same as I_{X_1,\cdots,X_n}(θ).
Theorem 6.5 (consistency of MLE, TBp. 275)
Under appropriate smoothness conditions of f, the MLE from an i.i.d. sample is consistent.
Proof (sketch): Let us denote the true value of θ by θ_0. The MLE maximizes
\[
\frac{l(\theta)}{n} = \frac{1}{n}\sum_{i=1}^{n} \log f(X_i \mid \theta).
\]
The weak law of large numbers implies that
\[
\frac{l(\theta)}{n} \xrightarrow{P}
E_{\theta_0}[\log f(X \mid \theta)]
= \int \log f(x \mid \theta)\, f(x \mid \theta_0)\, dx
\quad \text{as } n \to \infty.
\]
What is the relationship between
\[
\int [\log f(x \mid \theta)]\, f(x \mid \theta_0)\, dx
\quad\text{and}\quad
\int [\log f(x \mid \theta)]\, f(x \mid \theta)\, dx\,?
\]
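The convergence in the proof sketch can be illustrated by simulation (a sketch, not from the notes; the sample size n and the evaluation point θ are chosen for illustration). For Bernoulli(θ_0) data, l(θ)/n is the sample mean of log f(X_i|θ), which the weak law drives toward θ_0 log θ + (1−θ_0) log(1−θ), and the MLE (the sample mean) approaches θ_0.

```python
import random
from math import log

# Sketch: for X_1, ..., X_n i.i.d. Bernoulli(theta_0),
#  (a) l(theta)/n approaches E_{theta_0}[log f(X|theta)]  (weak law of large numbers),
#  (b) the MLE, the sample mean, approaches theta_0       (consistency).

random.seed(0)
theta0 = 0.3
n = 200_000
xs = [1 if random.random() < theta0 else 0 for _ in range(n)]

theta = 0.4  # an arbitrary evaluation point for the average log likelihood
avg_loglik = sum(x * log(theta) + (1 - x) * log(1 - theta) for x in xs) / n
limit = theta0 * log(theta) + (1 - theta0) * log(1 - theta)
assert abs(avg_loglik - limit) < 0.01

mle = sum(xs) / n  # MLE of theta for a Bernoulli sample
assert abs(mle - theta0) < 0.01
```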