Statistical Inference for FE
Professor S. Kou, Department of IEOR, Columbia University
Lecture 2. Properties of MLE and Estimation for Geometric Brownian
Motion
1. Properties of MLE's
MLE is perhaps the most widely used estimation procedure, mainly because
it leads to estimators with nice properties.
Under certain regularity conditions, the MLE has the following
properties.
(1) Consistency: $\hat{\theta} \to \theta$ in probability.

(2) Asymptotic normality: $\hat{\theta} \approx N(\theta, I(\theta)^{-1})$ in distribution, where the Fisher information matrix $I(\theta)$ is given by
$$I(\theta) = -E\left(\frac{\partial^2 \log L}{\partial \theta \, \partial \theta^T}\right).$$
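As a quick numerical illustration of properties (1) and (2), here is a minimal Monte Carlo sketch (not part of the original notes) using an exponential model as a stand-in: the MLE of the rate is $\hat{\lambda} = 1/\bar{X}$, and the Fisher information of a sample of size $n$ is $I(\lambda) = n/\lambda^2$, so $\mathrm{Var}(\hat{\lambda}) \approx \lambda^2 / n$ for large $n$.

```python
import numpy as np

# Monte Carlo sketch of consistency and asymptotic normality for an
# Exponential(rate) model: the MLE is lambda_hat = 1 / sample mean, and the
# Fisher information of a sample of size n is I(lambda) = n / lambda^2.
rng = np.random.default_rng(0)
lam, n, reps = 2.0, 2000, 1000
draws = rng.exponential(scale=1.0 / lam, size=(reps, n))
lam_hat = 1.0 / draws.mean(axis=1)   # one MLE per simulated sample

print(lam_hat.mean())      # close to lam = 2.0 (consistency)
print(lam_hat.var() * n)   # close to lam**2 = 4.0, matching I(lam)^{-1}
```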
(3) Asymptotic efficiency: it can be shown that the variance of the MLE is asymptotically the smallest possible (i.e., it achieves the Cramér-Rao lower bound). More precisely, no unbiased estimator can have smaller variance than that of the MLE, at least asymptotically.
(4) Invariance: the MLE of $C(\theta)$ is given by $C(\hat{\theta})$, if the function $C(\cdot)$ is continuously differentiable. For example, if we want to estimate the Sharpe ratio $(\mu - r)/\sigma$, where $r$ is the risk-free rate, we can simply use $(\hat{\mu} - r)/\hat{\sigma}$.
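The Sharpe-ratio example above can be sketched in a few lines. This is a minimal illustration under an assumed i.i.d. normal model with hypothetical parameter values ($\mu = 0.05$, $\sigma = 0.2$, $r = 0.01$), not the notes' own code:

```python
import numpy as np

# Invariance sketch: plug the MLEs of mu and sigma into the Sharpe ratio
# formula (mu - r) / sigma. All parameter values here are hypothetical.
rng = np.random.default_rng(1)
r = 0.01
returns = rng.normal(loc=0.05, scale=0.2, size=10_000)  # simulated returns

mu_hat = returns.mean()                 # MLE of mu under the normal model
sigma_hat = returns.std(ddof=0)         # MLE of sigma (divide by n, not n-1)
sharpe_hat = (mu_hat - r) / sigma_hat   # MLE of the Sharpe ratio by invariance
print(sharpe_hat)                       # close to (0.05 - 0.01) / 0.2 = 0.2
```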
2. Estimation of Asymptotic Variances of MLE's
For the MLE, the asymptotic variance is given by
$$I(\theta)^{-1} = \left(-E\left[\frac{\partial^2 \log L(\theta)}{\partial \theta \, \partial \theta^T}\right]\right)^{-1}.$$
However, since $\theta$ is unknown, we have to estimate $I(\theta)^{-1}$. There are three ways to do that.
(1) A natural estimator of the asymptotic variance $I(\theta)^{-1}$ is given by
$$V_A = I(\hat{\theta})^{-1}.$$
Of course, this approach depends on the assumption that $I(\theta)^{-1}$ can be computed easily or that an analytical form of it is readily available. In general, computation of $I(\theta)^{-1}$ requires (possibly high-dimensional) integration and matrix inversion, with the highest possible dimension being the sample size.
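When a closed form is available, the plug-in estimator $V_A$ is immediate. A minimal sketch, again using the exponential model (an assumption, not the notes' example), where $I(\lambda) = n/\lambda^2$ so $V_A = \hat{\lambda}^2 / n$:

```python
import numpy as np

# Plug-in estimator V_A = I(theta_hat)^{-1} for the Exponential(rate) model,
# where the Fisher information has the closed form I(lambda) = n / lambda^2.
rng = np.random.default_rng(2)
lam, n = 2.0, 5000
x = rng.exponential(scale=1.0 / lam, size=n)

lam_hat = 1.0 / x.mean()   # MLE of the rate
V_A = lam_hat**2 / n       # I(lam_hat)^{-1} = lam_hat^2 / n
print(lam_hat, V_A)        # V_A close to lam^2 / n = 0.0008
```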
(2) If $I(\theta)^{-1}$ cannot be computed easily, then we can also estimate the asymptotic variance as $V_E(\hat{\theta})$, where
$$V_E(\theta) = -\left(\frac{\partial^2 \log L(\theta)}{\partial \theta \, \partial \theta^T}\right)^{-1}.$$
In other words, we simply ignore the expectation. This works if the sample size is large enough and if $-\partial^2 \log L(\theta)/\partial \theta \, \partial \theta^T$ can be written as a sum of independent random variables, so that the law of large numbers applies.
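The observed-information estimator $V_E(\hat{\theta})$ can be computed numerically even when no expectation is tractable. A sketch under the same assumed exponential model, differentiating $\log L(\lambda) = n \log \lambda - \lambda \sum_i X_i$ by finite differences:

```python
import numpy as np

# Observed-information estimator V_E(theta_hat): evaluate the second
# derivative of log L at the MLE numerically, with no expectation taken.
# The Exponential(rate) model is used purely as an illustration.
rng = np.random.default_rng(3)
lam, n = 2.0, 5000
x = rng.exponential(scale=1.0 / lam, size=n)

def loglik(l):
    # log-likelihood of an Exponential(rate = l) sample
    return n * np.log(l) - l * x.sum()

lam_hat = 1.0 / x.mean()
h = 1e-4
# central finite difference for the second derivative of log L at lam_hat
d2 = (loglik(lam_hat + h) - 2 * loglik(lam_hat) + loglik(lam_hat - h)) / h**2
V_E = -1.0 / d2
print(V_E)   # agrees with the exact value lam_hat^2 / n
```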
(3) Alternatively, we can use the BHHH estimator (Berndt, Hall, Hall, and Hausman, 1974, Annals of Economic and Social Measurement). The idea of the BHHH estimator is based on the observation that, under suitable conditions,
$$\mathrm{Var}(g_i) = -E\left[\frac{\partial^2 \log f(X_i, \theta)}{\partial \theta \, \partial \theta^T}\right] = I(\theta), \qquad E[g_i] = 0, \tag{1}$$
where
$$g_i = \frac{\partial \log f(X_i, \theta)}{\partial \theta}.$$
In the appendix, we shall derive (1). Now suppose that the samples are independent (but not necessarily identically distributed). Then
$$-E\left[\frac{\partial^2 \log L(\theta)}{\partial \theta \, \partial \theta^T}\right] = -\sum_{i=1}^{n} E\left[\frac{\partial^2 \log f(X_i, \theta)}{\partial \theta \, \partial \theta^T}\right] = \sum_{i=1}^{n} \mathrm{Var}(g_i).$$
The question becomes how to estimate $\mathrm{Var}(g_i)$ based on one observation $X_i$. Since $E[g_i] = 0$, a natural way to do this is to use $g_i g_i^T$, resulting in the BHHH estimator
$$V_{\mathrm{BHHH}} = \left[\sum_{i=1}^{n} g_i g_i^T\right]^{-1}, \qquad g_i = \frac{\partial \log f(X_i, \theta)}{\partial \theta},$$
where $n$ is the sample size.
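A minimal sketch of the BHHH estimator, once more under an assumed exponential model, where the per-observation score is $g_i = 1/\lambda - X_i$ and is evaluated at the MLE:

```python
import numpy as np

# BHHH variance estimator for the Exponential(rate) model: the score of one
# observation is g_i = 1/lambda - x_i, so V_BHHH = [sum_i g_i^2]^{-1} in this
# one-dimensional case.
rng = np.random.default_rng(4)
lam, n = 2.0, 5000
x = rng.exponential(scale=1.0 / lam, size=n)

lam_hat = 1.0 / x.mean()
g = 1.0 / lam_hat - x          # scores evaluated at the MLE
V_BHHH = 1.0 / np.sum(g * g)   # [sum_i g_i g_i^T]^{-1}, scalar theta here
print(V_BHHH)                  # close to lam_hat^2 / n
```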