Statistical Inference for FE
Professor S. Kou, Department of IEOR, Columbia University
Lecture 5. Bayesian Inference
1 Introduction
So far we have focused on point estimation and hypothesis testing via frequentist methods. The frequentist method is based on the following assumptions.
F1. Probability is equivalent to a limiting relative frequency, and hence is an objective property.
F2. Unknown parameters are fixed, deterministic constants.
F3. Inference procedures should always be interpreted via long-run averages. For example, a 95% c.i. should have a 95% limiting coverage frequency if we repeat the same procedure many times.
However, there is a different school of statistical inference, called Bayesian inference, which is based on the following assumptions.
B1. Probability relates to degree of belief, not frequency. With this interpretation, probability has wider applications. For example, we can say that with probability 0.55 an apple from a tree did drop on the head of Isaac Newton. This statement reflects a subjective belief, not a limiting frequency.
B2. Unknown parameters are uncertain and therefore can be modeled as random variables.
B3. Inference means giving an updated prediction about the distribution of the unknown parameters.
Clearly the Bayesian approach is subjective; this contributes to the popularity of the frequentist approach, as people in general like objective methods. However, in finance there is growing support for the Bayesian approach, mainly because estimation in many financial problems is very hard without Bayesian methods. For example, since it is very difficult to estimate the true unknown returns of stocks, it makes sense to model the returns as random variables, and to use Bayesian approaches to give estimators of returns by combining the subjective views of traders with the empirical data. This is the view adopted in the Black-Litterman asset allocation method, which we will cover later.
2 The Bayesian Method
Bayesian inference requires two inputs.
1. A prior distribution f(θ) for the unknown parameter θ, specified before we see the data.
2. A likelihood function, or a model, that links θ and the data X = (X_1, ..., X_n). More precisely, we need to specify the conditional joint density f(X | θ).
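As a concrete illustration (not from the lecture), the two inputs can be written down for a hypothetical coin-flip model: a Beta(a, b) prior on the success probability θ and an i.i.d. Bernoulli(θ) likelihood for the data.

```python
import math

# Hypothetical example: Beta(a, b) prior on theta, Bernoulli(theta) likelihood.
def prior(theta, a=2.0, b=2.0):
    """Beta(a, b) prior density f(theta) on (0, 1)."""
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return const * theta ** (a - 1) * (1 - theta) ** (b - 1)

def likelihood(x, theta):
    """Joint density f(X | theta) for i.i.d. Bernoulli(theta) data x."""
    k = sum(x)            # number of successes
    n = len(x)            # sample size
    return theta ** k * (1 - theta) ** (n - k)
```

These two functions are exactly the ingredients Bayes' formula needs; everything that follows is computed from their product.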
After this, we can calculate the posterior distribution f(θ | X) by using Bayes' formula from elementary probability. More precisely,

f(θ | X) = f(θ, X) / f(X) = f(X | θ) f(θ) / ∫ f(θ, X) dθ = f(X | θ) f(θ) / ∫ f(X | θ) f(θ) dθ.
We can write this as

f(θ | X) ∝ f(X | θ) f(θ),
because the normalizing constant

C(X) = ∫ f(X | θ) f(θ) dθ

does not depend on θ. Typically C(X) can be obtained either analytically (by using the fact that the total probability must be one) or by numerical integration.
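As a minimal sketch of the numerical route, assume a hypothetical Beta(2, 2) prior on θ with i.i.d. Bernoulli(θ) data. Then C(X) can be approximated on a grid, and the resulting posterior can be checked against the known analytic answer, since the Beta prior is conjugate and the exact posterior is Beta(a + k, b + n − k).

```python
import math

# Hypothetical setup: Beta(a, b) prior on theta, i.i.d. Bernoulli(theta) data.
a, b = 2.0, 2.0
x = [1, 1, 0, 1, 0, 1, 1]            # toy data: k = 5 successes out of n = 7
k, n = sum(x), len(x)

def prior(theta):
    c = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return c * theta ** (a - 1) * (1 - theta) ** (b - 1)

def likelihood(theta):
    return theta ** k * (1 - theta) ** (n - k)

# Normalizing constant C(X) = integral of f(X | theta) f(theta) d(theta),
# approximated by the midpoint rule on a fine grid over (0, 1).
m = 100_000
grid = [(i + 0.5) / m for i in range(m)]
cx = sum(likelihood(t) * prior(t) for t in grid) / m

def posterior(theta):
    return likelihood(theta) * prior(theta) / cx

# Analytic check: conjugacy gives the exact posterior Beta(a + k, b + n - k).
a_post, b_post = a + k, b + n - k
c_post = math.gamma(a_post + b_post) / (math.gamma(a_post) * math.gamma(b_post))
exact = c_post * 0.6 ** (a_post - 1) * (1 - 0.6) ** (b_post - 1)
print(abs(posterior(0.6) - exact))   # close to zero
```

The grid approximation is only practical for low-dimensional θ; for higher dimensions one turns to Monte Carlo methods instead.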
A recent revolution in Bayesian analysis is that very often we do not need