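(α − 1) log θ + (β − 1) log(1 − θ) − log B(α, β)).

Differentiating and setting to zero at $\hat{\theta}_{\text{MAP}}$,

$$\frac{m_H}{\hat{\theta}_{\text{MAP}}} - \frac{m - m_H}{1 - \hat{\theta}_{\text{MAP}}} + \frac{\alpha - 1}{\hat{\theta}_{\text{MAP}}} - \frac{\beta - 1}{1 - \hat{\theta}_{\text{MAP}}} = 0$$

$$\hat{\theta}_{\text{MAP}} = \frac{m_H + \alpha - 1}{m + \beta - 1 + \alpha - 1}. \tag{9}$$

This is a very nice result illustrating some interesting properties of the MAP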
estimate. In particular, comparing the MAP estimate in (9) to the ML estimate in (5), which was

$$\hat{\theta}_{\text{ML}} = \frac{m_H}{m},$$

we see that the MAP estimate is equivalent to the ML estimate of a data set
with α − 1 additional Heads and β − 1 additional Tails. When we specify,
for example, a prior of α = 7 and β = 3, it is literally as if we had begun the
coin tossing experiment with 6 Heads and 2 Tails on the record. If we truly
believed before we started flipping coins that the probability of Heads was
around 6/8, then this is a good idea. This can be very useful in reducing the
variance of the estimate for small samples.
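
The pseudo-count reading of (9) can be checked numerically. The sketch below is only an illustration, not part of the original notes. It uses the prior α = 7, β = 3 from the text, but the data set (4 Heads in 10 flips), the assumed true θ = 3/4, and the function names are hypothetical choices made for the example. It also assumes integer α and β so that exact fractions can be used. The variance formulas follow from treating the number of Heads as Binomial(m, θ) with θ fixed.

```python
from fractions import Fraction


def ml_estimate(m_heads, m):
    """ML estimate as in (5): the fraction of Heads among m flips."""
    return Fraction(m_heads, m)


def map_estimate(m_heads, m, alpha, beta):
    """MAP estimate as in (9) under a Beta(alpha, beta) prior (integer alpha, beta assumed)."""
    return Fraction(m_heads + alpha - 1, m + alpha - 1 + beta - 1)


# Prior from the text: alpha = 7, beta = 3, i.e. 6 extra Heads and 2 extra Tails.
alpha, beta = 7, 3

# Hypothetical data, chosen only for illustration: 4 Heads in 10 flips.
m_heads, m = 4, 10

theta_ml = ml_estimate(m_heads, m)
theta_map = map_estimate(m_heads, m, alpha, beta)

# ML estimate on the data set augmented with alpha - 1 Heads and beta - 1 Tails.
theta_aug = ml_estimate(m_heads + (alpha - 1), m + (alpha - 1) + (beta - 1))

# Sampling variance of each estimator when the Heads count is Binomial(m, theta).
# theta = 3/4 is an assumed "true" value, matching the prior belief of 6/8.
theta = Fraction(3, 4)
var_ml = theta * (1 - theta) / m
var_map = m * theta * (1 - theta) / (m + alpha + beta - 2) ** 2

print("ML estimate:           ", theta_ml)
print("MAP estimate:          ", theta_map)
print("ML on augmented data:  ", theta_aug)
print("Var(ML), Var(MAP):     ", var_ml, var_map)
```

The MAP estimate and the ML estimate on the augmented data agree exactly, and the MAP estimator has the smaller sampling variance because its denominator includes the extra α + β − 2 pseudo-flips. The lower variance comes from pulling the estimate toward the prior, so it helps only to the extent that the prior belief is reasonable.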
For example, suppose the data contain only one coin flip, a Heads. The ML
estimate wil...
This note was uploaded on 03/24/2014 for the course MIT 15.097 taught by Professor Cynthia Rudin during the Spring '12 term at MIT.