Theorem. Let F be a finite family of likelihood models, with Θ = {θ : p(·|θ) ∈ F} the finite parameter space. Let y = (y_1, …, y_m) be a set of independent samples from an arbitrary distribution q(·), and let θ* be as in (18). If p(θ = θ*) > 0, then p(θ = θ*|y) → 1 as m → ∞.
Proof. Consider any θ ≠ θ*:

\[
\log \frac{p(\theta \mid y)}{p(\theta^* \mid y)}
= \log \frac{p(\theta)\, p(y \mid \theta)}{p(\theta^*)\, p(y \mid \theta^*)}
= \log \frac{p(\theta)}{p(\theta^*)} + \sum_{i=1}^{m} \log \frac{p(y_i \mid \theta)}{p(y_i \mid \theta^*)}. \tag{19}
\]

Consider the sum. By the strong law of large numbers, with probability 1,
\[
\frac{1}{m} \sum_{i=1}^{m} \log \frac{p(y_i \mid \theta)}{p(y_i \mid \theta^*)}
\;\to\; \mathbb{E}_{y \sim q(y)} \log \frac{p(y \mid \theta)}{p(y \mid \theta^*)}. \tag{20}
\]
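This almost-sure convergence can be illustrated with a quick simulation. The Bernoulli family, the true parameter 0.45, and the sample size below are all assumptions chosen for illustration, not part of the notes:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed example: y_i ~ q = Bernoulli(0.45); two candidate Bernoulli models.
q_true, theta, theta_star = 0.45, 0.2, 0.5

def log_ratio(y, t1, t2):
    # log p(y|t1)/p(y|t2) for a Bernoulli observation y in {0, 1}
    return np.where(y == 1, np.log(t1 / t2), np.log((1 - t1) / (1 - t2)))

y = (rng.random(200_000) < q_true).astype(int)

# Running average (1/m) * sum_i log p(y_i|theta)/p(y_i|theta*)
running_avg = np.cumsum(log_ratio(y, theta, theta_star)) / np.arange(1, len(y) + 1)

# The SLLN limit: E_{y~q} log p(y|theta)/p(y|theta*), computed in closed form
limit = q_true * np.log(theta / theta_star) \
    + (1 - q_true) * np.log((1 - theta) / (1 - theta_star))

print(running_avg[-1], limit)  # the running average settles near the (negative) limit
```

Note that the limit is a negative constant here, which is exactly what drives the rest of the proof.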
Now,

\[
\begin{aligned}
\mathbb{E}_{y \sim q(y)} \log \frac{p(y \mid \theta)}{p(y \mid \theta^*)}
&= \mathbb{E}_{y \sim q(y)} \log \frac{p(y \mid \theta)\, q(y)}{p(y \mid \theta^*)\, q(y)} \quad \text{(multiplying by 1 in disguise)} \\
&= \mathbb{E}_{y \sim q(y)} \left[ \log \frac{q(y)}{p(y \mid \theta^*)} - \log \frac{q(y)}{p(y \mid \theta)} \right] \\
&= D(q(\cdot) \,\|\, p(\cdot \mid \theta^*)) - D(q(\cdot) \,\|\, p(\cdot \mid \theta)) \\
&< 0.
\end{aligned}
\]

Why is this negative? Because θ* minimizes D(q(·)‖p(·|θ)) over Θ, as in (18), and θ ≠ θ* (strict inequality assuming the minimizer in (18) is unique), so D(q(·)‖p(·|θ*)) < D(q(·)‖p(·|θ)). By (20), with probability 1, as m → ∞,
\[
\sum_{i=1}^{m} \log \frac{p(y_i \mid \theta)}{p(y_i \mid \theta^*)}
\;\to\; m \, \mathbb{E}_{y \sim q(y)} \log \frac{p(y \mid \theta)}{p(y \mid \theta^*)} = -\infty.
\]
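The "1 in disguise" identity above, that the expected log-likelihood ratio equals a difference of KL divergences, can be checked numerically. The three-outcome distributions q, p_star, and p_other below are hypothetical examples, with p_star chosen KL-closer to q:

```python
import numpy as np

# Hypothetical discrete example: q is the true distribution over 3 outcomes;
# p_star plays the role of p(.|theta*), p_other the role of p(.|theta).
q = np.array([0.5, 0.3, 0.2])
p_star = np.array([0.45, 0.35, 0.2])   # closer to q in KL divergence
p_other = np.array([0.2, 0.2, 0.6])    # farther from q in KL divergence

def kl(a, b):
    # D(a || b) = sum_y a(y) log(a(y)/b(y))
    return np.sum(a * np.log(a / b))

# E_{y~q} log p_other(y)/p_star(y), computed directly ...
expected_log_ratio = np.sum(q * np.log(p_other / p_star))

# ... equals D(q || p_star) - D(q || p_other), the identity in the proof.
identity = kl(q, p_star) - kl(q, p_other)

assert np.isclose(expected_log_ratio, identity)
assert expected_log_ratio < 0  # negative because p_star is KL-closer to q
```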
Let's plug back into (19). By supposition, p(θ*) > 0 (and the prior doesn't change as m grows), so the first term log p(θ)/p(θ*) is a finite constant. Hence log [p(θ|y)/p(θ*|y)] → −∞, i.e., p(θ|y)/p(θ*|y) → 0 with probability 1 for every θ ≠ θ*. Since Θ is finite and the posterior probabilities sum to 1 over Θ, it follows that p(θ = θ*|y) → 1. □
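The theorem can be watched in action with a small sketch: a hypothetical finite family of Bernoulli models, a true distribution q outside the family, and the posterior piling up on the KL-closest member θ*. All numbers below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite family of Bernoulli models; the truth q need not be in it.
thetas = np.array([0.2, 0.5, 0.8])   # candidate Bernoulli success probabilities
prior = np.array([1/3, 1/3, 1/3])    # p(theta) > 0 for every theta
q_true = 0.45                        # true success probability (not in the family)

def kl_bern(q, p):
    # D(Bernoulli(q) || Bernoulli(p))
    return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

# theta* = argmin over the family of D(q || p(.|theta)), as in (18)
theta_star = thetas[np.argmin([kl_bern(q_true, t) for t in thetas])]

# Posterior after m samples: p(theta|y) proportional to p(theta) * prod_i p(y_i|theta)
m = 5000
y = rng.random(m) < q_true
loglik = y.sum() * np.log(thetas) + (m - y.sum()) * np.log(1 - thetas)
log_post = np.log(prior) + loglik
post = np.exp(log_post - log_post.max())  # shift by max for numerical stability
post /= post.sum()

print(theta_star, post)  # posterior mass piles up on theta* = 0.5
```

Working in log space and subtracting the maximum before exponentiating avoids underflow, since the raw likelihoods are astronomically small at m = 5000.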
Spring '12, Cynthia Rudin