We will assume that the data were generated from a probability distribution
that is described by some parameters θ (not necessarily scalar). We treat θ
as a random variable. We will use the shorthand notation p(y|θ) to represent
the family of conditional density functions over y, parameterized by the
random variable θ. We call this family p(y|θ) a likelihood function or likelihood
model for the data y, as it tells us how likely the data y are given the model
specified by any value of θ.
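As a concrete illustration of a likelihood function (this particular model is chosen here for illustration only, not taken from the notes), consider i.i.d. Bernoulli data, where p(y|θ) has a simple closed form:

```python
def bernoulli_likelihood(y, theta):
    """Likelihood p(y | theta) for i.i.d. Bernoulli observations.

    y: list of 0/1 observations; theta: success probability in (0, 1).
    For fixed data y, this is a function of theta.
    """
    k = sum(y)   # number of successes
    n = len(y)
    return theta**k * (1 - theta)**(n - k)

# The same data are more likely under theta = 0.7 than theta = 0.2
# when most observations are 1:
data = [1, 1, 0, 1, 1]
print(bernoulli_likelihood(data, 0.7))
print(bernoulli_likelihood(data, 0.2))
```

Evaluating the likelihood at different values of θ, with the data held fixed, is exactly the comparison "how likely are these data under this model" described above.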
We specify a prior distribution over θ, denoted p(θ). This distribution
represents any knowledge we have about how the data are generated prior to
observing them.
Our end goal is the conditional density function over θ, given the observed
data, which we denote as p(θ|y). We call this the posterior distribution, and
it informs us which parameters are likely given the observed data.
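The posterior combines the likelihood and the prior through Bayes' rule (a standard identity, stated here for reference):

```latex
p(\theta \mid y) \;=\; \frac{p(y \mid \theta)\, p(\theta)}{p(y)},
\qquad
p(y) \;=\; \int p(y \mid \theta)\, p(\theta)\, d\theta .
```

The denominator p(y) does not depend on θ, so as a function of θ the posterior is proportional to likelihood times prior.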
We, the modeler, specify the likelihood function (as a function of y and θ)
and the prior (we completely specify this) using our know...
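To see the two modeling choices working together, here is a minimal sketch, assuming (purely for illustration, not as the notes' model) a Beta(a, b) prior with the Bernoulli likelihood; conjugacy makes the posterior available in closed form:

```python
def posterior_beta_bernoulli(y, a, b):
    """Posterior parameters for a Beta(a, b) prior with i.i.d. Bernoulli data y.

    The modeler specifies both pieces: the likelihood (Bernoulli) and the
    prior (Beta).  Conjugacy gives the exact posterior
    Beta(a + #successes, b + #failures).
    """
    k = sum(y)   # successes
    n = len(y)
    return a + k, b + (n - k)

# Uniform prior Beta(1, 1), then observe 4 successes and 1 failure:
a_post, b_post = posterior_beta_bernoulli([1, 1, 0, 1, 1], 1, 1)
print(a_post, b_post)          # posterior is Beta(5, 2)
print(a_post / (a_post + b_post))  # posterior mean of theta
```

The posterior mean here, 5/7, sits between the prior mean (1/2) and the sample frequency (4/5), which is the usual way the prior tempers what the data alone would say.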
This note was uploaded on 03/24/2014 for the course MIT 15.097 taught by Professor Cynthiarudin during the Spring '12 term at MIT.