the whole theory would look nicer if it were built from the start
without reference to Bayesianism and priors.” Nevertheless, recent advances
in theory and particularly in computation have shown Bayesian statistics to
be very useful for many applications. Machine learning is concerned mainly
with prediction ability. A lot of the methods we discussed do not worry about
exactly what the underlying distribution is; as long as we can predict, we are
happy, regardless of whether we even have a meaningful estimate for $p(y \mid \theta)$.

2 Point estimates

Rather than estimate the entire distribution $p(\theta \mid y)$, sometimes it is sufficient
to find a single ‘good’ value for $\theta$. We call this a point estimate. For the sake
of completeness, we will briefly discuss two widely used point estimates, the
maximum likelihood (ML) estimate and the maximum a posteriori (MAP)
estimate.
2.1 Maximum likelihood estimation

The ML estimate for $\theta$ is denoted $\hat{\theta}_{\mathrm{ML}}$ and is the value for $\theta$ under which the
data are most likely:

$$\hat{\theta}_{\mathrm{ML}} \in \arg\max_{\theta} \, p(y \mid \theta).$$
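
As a rough illustration of the arg max above, the sketch below computes an ML estimate numerically for one specific model. The model is an assumption made only for this example and is not taken from the notes: the data y are taken to be i.i.d. Bernoulli draws with success probability θ. The function names `log_likelihood` and `ml_estimate_grid` and the data values are hypothetical. The code scans a grid of candidate θ values, keeps the one with the highest log-likelihood, and compares it with the well-known closed-form Bernoulli answer (the sample mean).

```python
import math


def log_likelihood(theta, y):
    """Log of p(y | theta), assuming i.i.d. Bernoulli(theta) observations (0/1 values)."""
    ones = sum(y)
    zeros = len(y) - ones
    return ones * math.log(theta) + zeros * math.log(1.0 - theta)


def ml_estimate_grid(y, num_points=999):
    """Approximate the arg max over theta by scanning an evenly spaced grid in (0, 1)."""
    # Grid excludes 0 and 1 so that both logarithms are always defined.
    grid = [(i + 1) / (num_points + 1) for i in range(num_points)]
    return max(grid, key=lambda theta: log_likelihood(theta, y))


# Hypothetical data for illustration: 7 ones out of 10 observations.
y = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

theta_ml = ml_estimate_grid(y)
closed_form = sum(y) / len(y)  # closed-form ML estimate for the Bernoulli model

print("grid-search ML estimate:", theta_ml)
print("closed-form ML estimate:", closed_form)
```

Maximizing the log-likelihood rather than the likelihood itself gives the same arg max, because the logarithm is monotone, and it avoids numerical underflow when there are many observations. The two printed values should agree to within the grid spacing.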
This note was uploaded on 03/24/2014 for the course MIT 15.097 taught by Professor Cynthia Rudin during the Spring '12 term at MIT.