ISyE8843A, Brani Vidakovic
Handout 3
1 Ingredients of Bayesian Inference
• The model for a typical observation X, conditional on the unknown parameter θ, is f(x | θ). As a function of θ, f(x | θ) = ℓ(θ) is called the likelihood. The functional form of f is fully specified up to the parameter θ.
• The parameter θ is supported by the parameter space Θ and is considered a random variable. The random variable θ has a distribution π(θ) that is called the prior.
• If the prior for θ is specified up to a parameter τ, π(θ | τ), then τ is called a hyperparameter.
• The distribution h(x, θ) = f(x | θ) π(θ) is called the joint distribution of X and θ.
• The joint distribution can also be factorized as h(x, θ) = π(θ | x) m(x).
• The distribution π(θ | x) is called the posterior distribution of θ, given X = x.
• The marginal distribution m(x) can be obtained by integrating θ out of the joint distribution h(x, θ),

    m(x) = ∫_Θ h(x, θ) dθ = ∫_Θ f(x | θ) π(θ) dθ.
• Therefore, the posterior π(θ | x) can be expressed as

    π(θ | x) = h(x, θ) / m(x) = f(x | θ) π(θ) / m(x) = f(x | θ) π(θ) / ∫_Θ f(x | θ) π(θ) dθ.
• Suppose Y ∼ f(y | θ) is to be observed. The (posterior) predictive distribution of Y, given the observed X = x, is

    f(y | x) = ∫_Θ f(y | θ) π(θ | x) dθ.

The marginal distribution m(y) = ∫_Θ f(y | θ) π(θ) dθ is sometimes called the prior predictive distribution.
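The relationships among the ingredients above can be illustrated numerically by discretizing Θ on a grid. The sketch below uses an illustrative model not taken from the handout — a N(θ, 1) likelihood with a N(0, 1) prior — and computes the joint, marginal, posterior, and posterior predictive exactly as defined in the bullets:

```python
import math

def normal_pdf(z, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at z."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Grid approximation of the parameter space Theta (illustrative choice).
thetas = [i / 100.0 for i in range(-500, 501)]
dtheta = 0.01
x = 1.5  # observed X = x (illustrative value)

prior      = [normal_pdf(t, 0.0, 1.0) for t in thetas]   # pi(theta)
likelihood = [normal_pdf(x, t, 1.0) for t in thetas]     # f(x | theta)
joint      = [f * p for f, p in zip(likelihood, prior)]  # h(x, theta) = f(x|theta) pi(theta)
m_x        = sum(joint) * dtheta                         # marginal m(x), theta integrated out
posterior  = [j / m_x for j in joint]                    # pi(theta | x) = h(x, theta) / m(x)

def predictive(y):
    """Posterior predictive f(y | x) = integral of f(y|theta) pi(theta|x) dtheta."""
    return sum(normal_pdf(y, t, 1.0) * q for t, q in zip(thetas, posterior)) * dtheta
```

For this normal–normal pair the posterior is available in closed form, N(x/2, 1/2), so the grid posterior mean should land near 0.75 — a quick way to check the discretization.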
    name                notation     equal to
    model, likelihood   f(x | θ)
    prior               π(θ)
    joint               h(x, θ)      f(x | θ) π(θ)
    marginal            m(x)         ∫_Θ f(x | θ) π(θ) dθ
    posterior           π(θ | x)     f(x | θ) π(θ) / m(x)
    predictive          f(y | x)     ∫_Θ f(y | θ) π(θ | x) dθ
Example 1: Suppose that the likelihood (model) for X given θ is binomial, B(n, θ), i.e.,

    f(x | θ) = (n choose x) θ^x (1 − θ)^(n−x),    x = 0, 1, ..., n,

and the prior is beta, Be(α, β), where the hyperparameters α and β are known,

    π(θ) = [1 / B(α, β)] θ^(α−1) (1 − θ)^(β−1),    0 ≤ θ ≤ 1.
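A sketch of Example 1 with illustrative values n = 10, x = 7, α = 2, β = 3 (these numbers are assumptions, not from the handout). The marginal m(x) integrates the binomial likelihood against the beta prior and has the closed form C(n, x) B(α + x, β + n − x) / B(α, β) — the beta-binomial pmf; the code checks that a numerical integral over Θ = [0, 1] agrees:

```python
import math

def beta_fn(a, b):
    """Euler beta function B(a, b) = Gamma(a)Gamma(b)/Gamma(a+b)."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def likelihood(x, theta, n):
    """Binomial model f(x | theta) = C(n, x) theta^x (1 - theta)^(n - x)."""
    return math.comb(n, x) * theta**x * (1 - theta) ** (n - x)

def prior(theta, alpha, beta):
    """Beta prior pi(theta) = theta^(alpha-1) (1 - theta)^(beta-1) / B(alpha, beta)."""
    return theta ** (alpha - 1) * (1 - theta) ** (beta - 1) / beta_fn(alpha, beta)

n, x, alpha, beta = 10, 7, 2.0, 3.0  # illustrative values (assumed)

# Closed-form marginal m(x): the beta-binomial pmf.
m_closed = math.comb(n, x) * beta_fn(alpha + x, beta + n - x) / beta_fn(alpha, beta)

# Numerical marginal: m(x) = integral over [0,1] of f(x|theta) pi(theta) dtheta,
# approximated with a midpoint Riemann sum.
K = 20000
m_numeric = sum(
    likelihood(x, (k + 0.5) / K, n) * prior((k + 0.5) / K, alpha, beta)
    for k in range(K)
) / K
```

Since f(x | θ) π(θ) ∝ θ^(α+x−1) (1 − θ)^(β+n−x−1), the posterior here is again beta, Be(α + x, β + n − x) — the conjugacy that makes this pair convenient.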