MIT15_097S12_lec15

[...] we model $\theta$ using a Beta prior (we will see later why this is a good idea): $\theta \sim \mathrm{Beta}(\alpha, \beta)$. The Beta distribution is:

$$\mathrm{Beta}(\theta; \alpha, \beta) = \frac{1}{B(\alpha, \beta)}\, \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}, \tag{7}$$

where $B(\alpha, \beta)$ is the beta function, and is constant with respect to $\theta$:

$$B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1} \, dt. \tag{8}$$

The quantities $\alpha$ and $\beta$ are parameters of the prior which we are free to set according to our prior belief about $\theta$. By varying $\alpha$ and $\beta$, we can encode a wide range of possible beliefs, as is shown in this figure taken from the Wikipedia article on the Beta distribution:

[Figure from the Wikipedia article on the Beta distribution. © Krishnavedala on Wikipedia. CC BY-SA. This content is excluded from our Creative Commons license. For more information, see http://ocw.mit.edu/fairuse.]

The MAP estimate for $\theta$ we can get from the formula for computing $\hat{\theta}_{\mathrm{MAP}}$ in (6), plugging in the formula for the likelihood we found in part 2, and the definition of the Beta distribution for the prior (7):

$$\begin{aligned}
\hat{\theta}_{\mathrm{MAP}} &\in \arg\max_{\theta} \big( \log p(y \mid \theta) + \log p(\theta) \big) \\
&= \arg\max_{\theta} \big( m_H \log \theta + (m - m_H) \log(1 - \theta) + \dots
\end{aligned}$$
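As a rough numerical check of the objective above, the following sketch evaluates the log Beta density from (7) and maximizes log-likelihood plus log-prior over a grid of $\theta$ values. It is an illustration under stated assumptions, not the derivation from the notes: the function names, the grid search, and the example numbers (7 of 10 observations, $\alpha = \beta = 2$) are all hypothetical, `m_h` stands in for $m_H$, and the identity $B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha + \beta)$ used to compute the normalizer is a standard result that does not appear in this excerpt.

```python
import math


def log_beta_fn(alpha, beta):
    # log B(alpha, beta), using the standard identity
    # B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b) (assumed, not from the excerpt).
    return math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)


def log_beta_pdf(theta, alpha, beta):
    # Log of the Beta density in equation (7), valid for 0 < theta < 1.
    return ((alpha - 1) * math.log(theta)
            + (beta - 1) * math.log(1 - theta)
            - log_beta_fn(alpha, beta))


def log_posterior(theta, m_h, m, alpha, beta):
    # MAP objective: log p(y | theta) + log p(theta), up to a constant in theta.
    log_lik = m_h * math.log(theta) + (m - m_h) * math.log(1 - theta)
    return log_lik + log_beta_pdf(theta, alpha, beta)


def map_estimate_grid(m_h, m, alpha, beta, n_grid=10000):
    # Brute-force search over interior grid points i / n_grid, i = 1..n_grid-1.
    # A hypothetical stand-in for solving the maximization analytically.
    thetas = (i / n_grid for i in range(1, n_grid))
    return max(thetas, key=lambda t: log_posterior(t, m_h, m, alpha, beta))


if __name__ == "__main__":
    # Illustrative numbers only: m = 10 observations, m_h = 7, Beta(2, 2) prior.
    theta_hat = map_estimate_grid(m_h=7, m=10, alpha=2.0, beta=2.0)
    print("grid MAP estimate:", theta_hat)
```

Working in log space avoids underflow when $m$ is large, and the grid search keeps the sketch independent of any closed-form solution; its resolution is limited to `1 / n_grid`.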