NTHU MATH 2820, 2008, Lecture Notes
Ch8, p.33
Theorem 6.4 (TBp. 276)
Under appropriate smoothness conditions on $f$,
$$
I(\theta) \equiv E\left[\left(\frac{\partial}{\partial\theta}\log f(X_1\mid\theta)\right)^2\right]
= -E\left[\frac{\partial^2}{\partial\theta^2}\log f(X_1\mid\theta)\right].
$$
Notes
1. Let $X_1, \ldots, X_n$ be an i.i.d. sample of size $n$ from a pdf/pmf $f(x\mid\theta)$. Then
$$
\begin{aligned}
I_{X_1,\ldots,X_n}(\theta)
&= E\left[\left(\frac{\partial}{\partial\theta}\log\prod_{i=1}^{n} f(X_i\mid\theta)\right)^2\right]
= E\left[\left(\sum_{i=1}^{n}\frac{\partial}{\partial\theta}\log f(X_i\mid\theta)\right)^2\right] \\
&= \sum_{i=1}^{n} E\left[\left(\frac{\partial}{\partial\theta}\log f(X_i\mid\theta)\right)^2\right]
+ 2\sum_{i<j} E\left[\frac{\partial}{\partial\theta}\log f(X_i\mid\theta)\right]
E\left[\frac{\partial}{\partial\theta}\log f(X_j\mid\theta)\right] \\
&= n\,E\left[\left(\frac{\partial}{\partial\theta}\log f(X_1\mid\theta)\right)^2\right]
\equiv n\,I(\theta),
\end{aligned}
$$
where the cross terms vanish because $E\left[\frac{\partial}{\partial\theta}\log f(X_i\mid\theta)\right] = 0$ (shown in the proof below).
2. $I(\theta)$ is the Fisher information contained in a sample of size one.
3. The Fisher information of independent samples is additive.
4. For an i.i.d. sample, writing $l(\theta)$ for the log-likelihood of the whole sample,
$$
I_{X_1,\ldots,X_n}(\theta) = n\,I(\theta) = E\{[l'(\theta)]^2\} = -E[l''(\theta)].
$$
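As a numerical sanity check (an illustration added here, not part of the original notes), the two expressions for $I(\theta)$ can be compared for the $N(\theta,1)$ density, whose Fisher information is known to be 1. The derivatives are taken by finite differences and the expectations by a Riemann sum; the value $\theta = 0.3$ is an arbitrary choice.

```python
import math

def f(x, theta):
    # N(theta, 1) density; its Fisher information is known to be 1
    return math.exp(-(x - theta) ** 2 / 2) / math.sqrt(2 * math.pi)

def dlog(x, theta, h=1e-5):
    # central difference for (d/d theta) log f(x|theta)
    return (math.log(f(x, theta + h)) - math.log(f(x, theta - h))) / (2 * h)

def d2log(x, theta, h=1e-4):
    # central difference for (d^2/d theta^2) log f(x|theta)
    return (math.log(f(x, theta + h)) - 2 * math.log(f(x, theta))
            + math.log(f(x, theta - h))) / h ** 2

theta = 0.3
xs = [-8 + 16 * k / 4000 for k in range(4001)]   # grid covering the support
dx = 16 / 4000
I_sq = sum(dlog(x, theta) ** 2 * f(x, theta) for x in xs) * dx
I_curv = -sum(d2log(x, theta) * f(x, theta) for x in xs) * dx
print(I_sq, I_curv)  # both close to 1
```

Both expectations come out near 1, matching the identity in Theorem 6.4.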
made by ShaoWei Cheng (NTHU, Taiwan)
Proof: Since $\int f(x\mid\theta)\,dx = 1$ for all $\theta$,
$$
0 = \frac{\partial}{\partial\theta}\int f(x\mid\theta)\,dx
= \int \frac{\partial}{\partial\theta} f(x\mid\theta)\,dx
= \int \left[\frac{\partial}{\partial\theta}\log f(x\mid\theta)\right] f(x\mid\theta)\,dx,
$$
$$
\begin{aligned}
0 = \frac{\partial^2}{\partial\theta^2}\int f(x\mid\theta)\,dx
&= \frac{\partial}{\partial\theta}\int \left[\frac{\partial}{\partial\theta}\log f(x\mid\theta)\right] f(x\mid\theta)\,dx \\
&= \int \left[\frac{\partial^2}{\partial\theta^2}\log f(x\mid\theta)\right] f(x\mid\theta)\,dx
+ \int \left[\frac{\partial}{\partial\theta}\log f(x\mid\theta)\right]^2 f(x\mid\theta)\,dx.
\end{aligned}
$$
The first display says $E\left[\frac{\partial}{\partial\theta}\log f(X\mid\theta)\right] = 0$; rearranging the second gives the stated identity. (Smoothness of $f$ is needed for interchanging integration and differentiation.)
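For a pmf the integrals above become sums, so the first identity $E\left[\frac{\partial}{\partial\theta}\log f(X\mid\theta)\right] = 0$ can be checked directly. A small illustration (not in the notes) for a Bernoulli pmf, with the score computed by a finite difference and $\theta = 0.3$ chosen arbitrarily:

```python
import math

theta = 0.3
pmf = {0: 1 - theta, 1: theta}  # Bernoulli(theta) pmf

def logf(x, t):
    # log of the Bernoulli pmf: x log t + (1 - x) log(1 - t)
    return x * math.log(t) + (1 - x) * math.log(1 - t)

h = 1e-6
# finite-difference score at each support point, then its expectation
score = {x: (logf(x, theta + h) - logf(x, theta - h)) / (2 * h) for x in pmf}
mean_score = sum(score[x] * pmf[x] for x in pmf)
print(mean_score)  # close to 0
```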
Example 6.18 (Fisher information of i.i.d. Bernoulli $B(\theta)$)
Let $X_1, \ldots, X_n$ be i.i.d. from the Bernoulli distribution $B(\theta)$ (i.e., the pmf of $X_i$ is $\theta^x(1-\theta)^{1-x}$, for $x \in \{0,1\}$); then $E(X_i) = \theta$ and $Var(X_i) = \theta(1-\theta)$.
• For a single observation $X_i$, the first and second derivatives of its log-likelihood are:
$$
\begin{aligned}
\log f(x\mid\theta) &= x\log\theta + (1-x)\log(1-\theta), \\
\partial \log f(x\mid\theta)/\partial\theta &= x/\theta - (1-x)/(1-\theta) = (x-\theta)/[\theta(1-\theta)], \\
\partial^2 \log f(x\mid\theta)/\partial\theta^2 &= -x/\theta^2 - (1-x)/(1-\theta)^2.
\end{aligned}
$$
• The Fisher information of a single observation, say $X_1$, is
$$
I(\theta)
= E\left[\left(\frac{X_1-\theta}{\theta(1-\theta)}\right)^2\right]
= \frac{E[(X_1-\theta)^2]}{\theta^2(1-\theta)^2}
= \frac{Var(X_1)}{\theta^2(1-\theta)^2}
= \frac{\theta(1-\theta)}{\theta^2(1-\theta)^2}
= \frac{1}{\theta(1-\theta)},
$$
or, using the second-derivative form,
$$
I(\theta)
= -E\left[-\frac{X_1}{\theta^2} - \frac{1-X_1}{(1-\theta)^2}\right]
= \frac{\theta}{\theta^2} + \frac{1-\theta}{(1-\theta)^2}
= \frac{1}{\theta} + \frac{1}{1-\theta}
= \frac{1}{\theta(1-\theta)}.
$$
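Both routes to $I(\theta)$ can be verified exactly by summing over the two support points $x \in \{0,1\}$. A short check (an illustration, not part of the notes; $\theta = 0.3$ is arbitrary):

```python
theta = 0.3
pmf = {0: 1 - theta, 1: theta}  # Bernoulli(theta) pmf

def score(x):
    # first derivative of log f from the example: (x - theta)/[theta(1 - theta)]
    return (x - theta) / (theta * (1 - theta))

def d2(x):
    # second derivative of log f: -x/theta^2 - (1 - x)/(1 - theta)^2
    return -x / theta ** 2 - (1 - x) / (1 - theta) ** 2

I_sq = sum(score(x) ** 2 * pmf[x] for x in pmf)  # E[(d log f / d theta)^2]
I_curv = -sum(d2(x) * pmf[x] for x in pmf)       # -E[d^2 log f / d theta^2]
target = 1 / (theta * (1 - theta))
print(I_sq, I_curv, target)  # all three agree
```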
• The Fisher information of observations $X_1, \ldots, X_n$ is
$$
I_{X_1,\cdots,X_n}(\theta) = n\,I(\theta) = \frac{n}{\theta(1-\theta)}.
$$
Notice that $I_{X_1,\cdots,X_n}(\theta)$
— increases when $n$ increases,
— increases when $\theta \downarrow 0$ or $\theta \uparrow 1$,
— reaches its minimum $4n$ at $\theta = 1/2$.
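The claimed minimum can be confirmed with a coarse grid search over $\theta \in (0,1)$ (a quick check added here; $n = 10$ is an arbitrary illustrative value):

```python
n = 10

def info(t):
    # I_{X_1,...,X_n}(theta) = n / [theta (1 - theta)] for i.i.d. Bernoulli
    return n / (t * (1 - t))

grid = [k / 1000 for k in range(1, 1000)]  # theta values in (0, 1)
t_min = min(grid, key=info)
print(t_min, info(t_min))  # minimum 4n = 40 attained at theta = 0.5
```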
5.
• Consider a single observation $Y \sim \text{Binomial}(n, \theta)$. The pmf of $Y$ is
$$
f(y\mid\theta) = \binom{n}{y}\theta^y(1-\theta)^{n-y}, \quad \text{for } y \in \{0, 1, \ldots, n\}.
$$
— The second derivative of the log-likelihood is
$$
\partial^2 \log f(y\mid\theta)/\partial\theta^2 = -y/\theta^2 - (n-y)/(1-\theta)^2.
$$
— The Fisher information of $Y$ is
$$
I_Y(\theta)
= -E\left[-\frac{Y}{\theta^2} - \frac{n-Y}{(1-\theta)^2}\right]
= \frac{n\theta}{\theta^2} + \frac{n-n\theta}{(1-\theta)^2}
= \frac{n}{\theta(1-\theta)}.
$$
— Note that $I_Y(\theta)$ is the same as $I_{X_1,\cdots,X_n}(\theta)$.
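Since the Binomial support is finite, $I_Y(\theta)$ can also be computed exactly by summing over $y = 0, \ldots, n$. A small verification (an added illustration; $n = 10$ and $\theta = 0.3$ are arbitrary):

```python
import math

n, theta = 10, 0.3

def pmf(y):
    # Binomial(n, theta) pmf
    return math.comb(n, y) * theta ** y * (1 - theta) ** (n - y)

def d2(y):
    # second derivative of log f(y|theta) from the display above
    return -y / theta ** 2 - (n - y) / (1 - theta) ** 2

I_Y = -sum(d2(y) * pmf(y) for y in range(n + 1))
print(I_Y, n / (theta * (1 - theta)))  # the two agree
```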
Theorem 6.5 (consistency of MLE, TBp. 275)
Under appropriate smoothness conditions on $f$, the MLE from an i.i.d. sample is consistent.
Proof (sketch):
Let us denote the true value of $\theta$ by $\theta_0$. The MLE maximizes
$$
\frac{l(\theta)}{n} = \frac{1}{n}\sum_{i=1}^{n}\log f(X_i\mid\theta).
$$
The weak law of large numbers implies that
$$
\frac{l(\theta)}{n} \xrightarrow{P} E_{\theta_0}[\log f(X\mid\theta)]
= \int \log f(x\mid\theta)\, f(x\mid\theta_0)\,dx
\quad \text{as } n \to \infty.
$$
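Consistency can be illustrated by simulation. For the Bernoulli model of Example 6.18, maximizing $l(\theta)$ gives the sample mean as MLE, which should drift toward $\theta_0$ as $n$ grows (a sketch, not part of the proof; the seed, $\theta_0 = 0.3$, and sample sizes are arbitrary choices):

```python
import random

random.seed(0)
theta0 = 0.3  # true parameter value

def bernoulli_mle(n):
    # for Bernoulli(theta), the MLE is the sample mean
    xs = [1 if random.random() < theta0 else 0 for _ in range(n)]
    return sum(xs) / n

estimates = [bernoulli_mle(10 ** k) for k in (2, 3, 4, 5)]
print(estimates)  # estimates approach theta0 = 0.3 as n grows
```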