

[...] we model $\theta$ using a Beta prior (we will see later why this is a good idea): $\theta \sim \mathrm{Beta}(\alpha, \beta)$. The Beta distribution is:

$$\mathrm{Beta}(\theta; \alpha, \beta) = \frac{1}{B(\alpha, \beta)}\, \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}, \tag{7}$$

where $B(\alpha, \beta)$ is the beta function, which is constant with respect to $\theta$:

$$B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1}\, dt. \tag{8}$$

The quantities $\alpha$ and $\beta$ are parameters of the prior which we are free to set according to our prior belief about $\theta$. By varying $\alpha$ and $\beta$, we can encode a wide range of possible beliefs, as is shown in this figure taken from the Wikipedia article on the Beta distribution:

[Figure: Beta distribution, from Wikipedia. © Krishnavedala on Wikipedia. CC BY-SA. This content is excluded from our Creative Commons license.]

We can get the MAP estimate for $\theta$ from the formula for computing $\hat{\theta}_{\mathrm{MAP}}$ in (6), plugging in the formula for the likelihood we found in part 2 and the definition of the Beta distribution for the prior (7):

$$\begin{aligned}
\hat{\theta}_{\mathrm{MAP}} &\in \arg\max_{\theta}\, \bigl(\log p(y \mid \theta) + \log p(\theta)\bigr) \\
&= \arg\max_{\theta}\, (m_H \log \theta + (m - m_H) \log(1 - \theta) + \ldots
\end{aligned}$$
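
As a rough numerical illustration of the Beta prior in (7) and the MAP objective, the following Python sketch evaluates the density and maximizes the log posterior on a grid. It is only a sketch under stated assumptions. The function names (`beta_function`, `beta_pdf`, `log_posterior`, `map_estimate_grid`) and the example values are hypothetical and not part of the notes. The beta function is computed with the standard identity $B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha + \beta)$ rather than the integral in (8). Since the derivation is cut off here, the code assumes that the remaining terms of the objective are the log of the Beta prior from (7), with the constant $-\log B(\alpha, \beta)$ dropped because it does not depend on $\theta$.

```python
import math


def beta_function(alpha, beta):
    """B(alpha, beta), computed via the gamma-function identity."""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)


def beta_pdf(theta, alpha, beta):
    """Beta density from (7), for theta strictly between 0 and 1."""
    unnormalized = theta ** (alpha - 1) * (1 - theta) ** (beta - 1)
    return unnormalized / beta_function(alpha, beta)


def log_posterior(theta, m_heads, m, alpha, beta):
    """Log-likelihood plus log-prior, up to a constant in theta.

    Assumption: the prior contributes (alpha - 1) log(theta)
    + (beta - 1) log(1 - theta), i.e. the log of (7) without -log B.
    """
    log_lik = m_heads * math.log(theta) + (m - m_heads) * math.log(1 - theta)
    log_prior = (alpha - 1) * math.log(theta) + (beta - 1) * math.log(1 - theta)
    return log_lik + log_prior


def map_estimate_grid(m_heads, m, alpha, beta, n_grid=10001):
    """Approximate the maximizer of the log posterior on a uniform grid.

    Only interior points are used, so log(0) is never evaluated.
    """
    grid = [i / n_grid for i in range(1, n_grid)]
    return max(grid, key=lambda t: log_posterior(t, m_heads, m, alpha, beta))


# Hypothetical example: 7 heads in 10 flips with a Beta(2, 2) prior.
print(beta_pdf(0.5, 2, 2))
print(map_estimate_grid(m_heads=7, m=10, alpha=2, beta=2))
```

The grid search is a deliberately simple choice: it avoids any external dependency and makes the objective explicit. It is reliable when $\alpha > 1$ and $\beta > 1$, in which case the objective has a single interior maximum. For other settings the maximum may sit at the boundary of $[0, 1]$, which the interior grid can only approach.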