od model, maximum likelihood is guaranteed to find the correct distribution as m goes to infinity. In proving consistency, however, we do not get finite-sample guarantees like those from statistical learning theory, and data are always finite.
Coin Flip Example, Part 2. Returning to the coin flip example, equation (2), the log-likelihood is

\[
R(\theta) = m_H \log\theta + (m - m_H)\log(1 - \theta).
\]
We can maximize this by differentiating, setting the derivative to zero, and doing a few lines of algebra:

\begin{align*}
0 = \left.\frac{dR(\theta)}{d\theta}\right|_{\hat{\theta}} &= \frac{m_H}{\hat{\theta}} - \frac{m - m_H}{1 - \hat{\theta}} \\
m_H\,(1 - \hat{\theta}_{ML}) &= (m - m_H)\,\hat{\theta}_{ML} \\
m_H - \hat{\theta}_{ML}\, m_H &= m\,\hat{\theta}_{ML} - \hat{\theta}_{ML}\, m_H \\
\hat{\theta}_{ML} &= \frac{m_H}{m}. \tag{5}
\end{align*}

(It turns out not to be difficult to verify that this is indeed a maximum.)
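As a quick sanity check on equation (5), the closed-form estimate can be compared against a direct numerical maximization of the log-likelihood. This is a minimal sketch; the counts m and m_H below are made-up example values, not from the notes.

```python
import numpy as np

# Hypothetical data (assumption, for illustration): 100 flips, 63 heads.
m, m_H = 100, 63

# Log-likelihood R(theta) = m_H log(theta) + (m - m_H) log(1 - theta), as above.
def log_likelihood(theta):
    return m_H * np.log(theta) + (m - m_H) * np.log(1 - theta)

# Closed-form MLE from equation (5): the observed proportion of heads.
theta_closed = m_H / m

# Numerical check: maximize R over a fine grid of theta values in (0, 1).
grid = np.linspace(1e-6, 1 - 1e-6, 1_000_001)
theta_grid = grid[np.argmax(log_likelihood(grid))]

print(theta_closed)  # 0.63
print(abs(theta_grid - theta_closed) < 1e-4)  # True
```

The grid maximizer lands on the observed proportion of heads (up to grid spacing), matching the algebraic result.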
In this case, the maximum likelihood estimate is exactly what we intuitively thought we should do: estimate θ as the observed proportion of Heads.

2.2 Maximum a posteriori (MAP) estimation

The MAP estimate is a pointwise estimate with a Bayesian flavor. Rather than finding θ tha...
This note was uploaded on 03/24/2014 for the course MIT 15.097 taught by Professor Cynthiarudin during the Spring '12 term at MIT.