MIT15_097S12_lec15

# Again it is generally more convenient to work with


... that maximizes the likelihood function, $p(y \mid \theta)$, we find $\theta$ that maximizes the posterior, $p(\theta \mid y)$. The distinction is between the $\theta$ under which the data are most likely, and the most likely $\theta$ given the data. We don't have to worry about evaluating the partition function $\int p(y \mid \theta')\,p(\theta')\,d\theta'$ because it is constant with respect to $\theta$. Again it is generally more convenient to work with the logarithm.

$$
\hat{\theta}_{\mathrm{MAP}} \in \arg\max_{\theta} p(\theta \mid y) = \arg\max_{\theta} \frac{p(y \mid \theta)\,p(\theta)}{\int p(y \mid \theta')\,p(\theta')\,d\theta'} = \arg\max_{\theta} p(y \mid \theta)\,p(\theta) = \arg\max_{\theta} \left( \log p(y \mid \theta) + \log p(\theta) \right). \tag{6}
$$

When the prior is uniform, the MAP estimate is identical to the ML estimate because the $\log p(\theta)$ is constant.

One might ask what would be a bad choice for a prior. We will see later that reasonable choices of the prior are those that do not assign zero probability to the true value of $\theta$. If we have such a prior, the MAP estimate is consistent, which we will discuss in more detail later. Some other properties of the MAP estimate are illustrated in the next example.

Coin Flip Example Part 3. We again return to the coin flip example. Suppo...
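As a rough numerical illustration of the MAP definition in equation (6), the sketch below maximizes log-likelihood plus log-prior over a grid of candidate $\theta$ values for a coin with a Bernoulli likelihood. Everything specific here is an assumption made for illustration and is not taken from the notes: the counts (7 heads, 3 tails), the Beta(5, 5) prior, the grid search itself, and all function names. The log-prior is written up to an additive constant, which is harmless for the same reason the partition function can be ignored: it does not depend on $\theta$.

```python
import math


def log_likelihood(theta, heads, tails):
    """Bernoulli log-likelihood log p(y | theta) for the observed counts."""
    return heads * math.log(theta) + tails * math.log(1.0 - theta)


def log_beta_prior(theta, alpha, beta):
    """Log of a Beta(alpha, beta) prior, dropping the constant normalizer."""
    return (alpha - 1.0) * math.log(theta) + (beta - 1.0) * math.log(1.0 - theta)


def argmax_on_grid(objective, n_points=9999):
    """Return the grid point in (0, 1) with the largest objective value."""
    grid = [(i + 1) / (n_points + 1) for i in range(n_points)]
    return max(grid, key=objective)


# Hypothetical data and prior, chosen only for illustration.
heads, tails = 7, 3
alpha, beta = 5.0, 5.0

# ML: maximize log p(y | theta) alone.
theta_ml = argmax_on_grid(lambda t: log_likelihood(t, heads, tails))

# MAP: maximize log p(y | theta) + log p(theta), as in equation (6).
theta_map = argmax_on_grid(
    lambda t: log_likelihood(t, heads, tails) + log_beta_prior(t, alpha, beta)
)

# Uniform prior, i.e. Beta(1, 1): the log-prior term is constant in theta.
theta_map_uniform = argmax_on_grid(
    lambda t: log_likelihood(t, heads, tails) + log_beta_prior(t, 1.0, 1.0)
)

print("ML estimate:           ", theta_ml)
print("MAP estimate:          ", theta_map)
print("MAP with uniform prior:", theta_map_uniform)
```

Under these assumed numbers the ML estimate should sit near the observed frequency of heads, while the MAP estimate is pulled toward the prior's center at 0.5. With the uniform prior the two objectives differ only by a constant, so the grid search returns the same point for both.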