se we model θ using a Beta prior (we will see later why this is a good
idea): θ ∼ Beta(α, β). The Beta distribution is:

$$\mathrm{Beta}(\theta; \alpha, \beta) = \frac{1}{B(\alpha, \beta)}\, \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}, \tag{7}$$

where B(α, β) is the beta function, and is constant with respect to θ:

$$B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1}\, dt. \tag{8}$$

The quantities α and β are parameters of the prior, which we are free to set
according to our prior belief about θ. By varying α and β, we can encode
a wide range of possible beliefs, as is shown in this figure taken from the
Wikipedia article on the Beta distribution:
© Krishnavedala on Wikipedia. CC BY-SA. This content is excluded from our Creative
Commons license. For more information, see http://ocw.mit.edu/fairuse.
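To make the effect of α and β concrete, the density in (7) can be evaluated directly. The following is a minimal sketch, not part of the original notes: the function names and the three (α, β) settings are hypothetical, and B(α, β) is computed with the standard identity B(α, β) = Γ(α)Γ(β)/Γ(α + β) rather than by evaluating the integral in (8).

```python
import math


def beta_function(alpha, beta):
    """B(alpha, beta) from (8), using the identity Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)


def beta_pdf(theta, alpha, beta):
    """Beta density from (7), for 0 < theta < 1."""
    return theta ** (alpha - 1) * (1 - theta) ** (beta - 1) / beta_function(alpha, beta)


# Hypothetical (alpha, beta) settings, chosen only for illustration.
priors = {
    "uniform: Beta(1, 1)": (1.0, 1.0),
    "concentrated near 0.5: Beta(5, 5)": (5.0, 5.0),
    "favors large theta: Beta(8, 2)": (8.0, 2.0),
}

for name, (a, b) in priors.items():
    densities = [beta_pdf(t, a, b) for t in (0.1, 0.5, 0.9)]
    print(name, [round(d, 3) for d in densities])
```

Setting α = β = 1 gives a flat density, equal α and β above 1 concentrate mass around 0.5, and α > β shifts mass toward larger θ.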
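We can get the MAP estimate for θ from the formula for computing $\hat{\theta}_{\mathrm{MAP}}$
in (6), plugging in the formula for the likelihood we found in part 2 and the
definition of the Beta distribution for the prior (7):

$$\begin{aligned}
\hat{\theta}_{\mathrm{MAP}} &\in \arg\max_{\theta}\, \big( \log p(y \mid \theta) + \log p(\theta) \big) \\
&= \arg\max_{\theta}\, \big( m_H \log \theta + (m - m_H) \log(1 - \theta) + \dots
\end{aligned}$$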
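The maximization above can also be checked numerically. The following is a minimal sketch under stated assumptions, not the derivation from the notes: it assumes the prior contributes (α − 1) log θ + (β − 1) log(1 − θ) to the objective, which is the logarithm of (7) with the constant −log B(α, β) dropped, and it finds the maximizer by brute-force grid search. The function names, the data (7 successes in 10 trials), and the Beta(2, 2) prior are all hypothetical.

```python
import math


def log_posterior_unnormalized(theta, m_h, m, alpha, beta):
    """log p(y | theta) + log p(theta), dropping terms that are constant in theta."""
    log_likelihood = m_h * math.log(theta) + (m - m_h) * math.log(1 - theta)
    # Assumed form of the log-prior: log of (7) without the -log B(alpha, beta) constant.
    log_prior = (alpha - 1) * math.log(theta) + (beta - 1) * math.log(1 - theta)
    return log_likelihood + log_prior


def map_estimate_grid(m_h, m, alpha, beta, steps=100_000):
    """Approximate the arg max over theta in (0, 1) with a uniform grid search."""
    grid = (i / steps for i in range(1, steps))
    return max(grid, key=lambda t: log_posterior_unnormalized(t, m_h, m, alpha, beta))


# Hypothetical data and prior, for illustration only.
theta_map = map_estimate_grid(m_h=7, m=10, alpha=2.0, beta=2.0)
print(theta_map)
```

With α = β = 1 the prior term vanishes, so the same search returns the maximum likelihood estimate m_H / m up to the grid resolution.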