10/25/10 Learning With Bayesian Networks
10/25/10 Learning Is Trivial If All Variables Are Observed
[Figure: a small Bayesian network over the variables X, Y, Z]
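A minimal sketch of the point, assuming a chain X → Y → Z over binary variables (the tiny dataset is hypothetical): when every variable is observed, the maximum-likelihood CPT entries are just normalized counts.

```python
# Maximum-likelihood CPT estimation by counting, for a fully observed
# chain X -> Y -> Z with binary variables. Dataset is hypothetical.
from collections import Counter

data = [  # fully observed samples (x, y, z)
    (0, 0, 0), (0, 1, 0), (1, 1, 1),
    (1, 1, 0), (1, 0, 0), (0, 0, 1),
]

x_counts = Counter(x for x, _, _ in data)
xy_counts = Counter((x, y) for x, y, _ in data)

# P(X = x): relative frequency of each value of X.
p_x = {x: c / len(data) for x, c in x_counts.items()}
# P(Y = y | X = x): counts of (x, y) normalized within each x.
# P(Z = z | Y = y) would be estimated the same way.
p_y_given_x = {(x, y): c / x_counts[x] for (x, y), c in xy_counts.items()}

print(p_x)
print(p_y_given_x)
```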
10/25/10 Inference as Learning
Suppose you have a coin with an unknown bias, θ ≡ P(head). You flip the coin multiple times and observe the outcomes. From those observations you can infer the bias of the coin. This is learning, and it is also inference.
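A quick illustrative sketch (the true bias 0.7 and the flip count are arbitrary choices): simulate flips of a coin with a hidden bias, then recover the bias from the observations alone.

```python
# Simulate coin flips with a hidden bias, then estimate that bias
# from the data. true_theta and the flip count are arbitrary.
import random

true_theta = 0.7                                  # hidden theta = P(head)
flips = [random.random() < true_theta for _ in range(1000)]

theta_hat = sum(flips) / len(flips)               # fraction of heads
print(f"estimated bias: {theta_hat:.3f}")
```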
10/25/10 Sufficient Statistics
Because the flips are independent events, the likelihood factorizes: P(d | θ) = θ^NH (1 - θ)^NT, where NH and NT are the numbers of observed heads and tails. The counts (NH, NT) are sufficient statistics: the likelihood depends on the data only through them.
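A small sketch of sufficiency (θ = 0.6 and the two sequences are arbitrary): two orderings with the same counts get exactly the same likelihood.

```python
# The likelihood depends on the data only through the counts (NH, NT),
# so two sequences with the same counts score identically.
def likelihood(theta, nh, nt):
    return theta ** nh * (1 - theta) ** nt

for seq in ("HHTHT", "TTHHH"):                    # both have NH=3, NT=2
    nh, nt = seq.count("H"), seq.count("T")
    print(seq, likelihood(0.6, nh, nt))
```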
10/25/10 Simple Case: Coin Flipping With Discrete Hypotheses
§ Two hypotheses: h0: θ = 0.5, h1: θ = 0.9
§ Role of priors diminishes as the number of flips increases
§ Note the weirdness that each hypothesis has an associated probability, and each hypothesis specifies a probability: probabilities of probabilities!
§ Setting a prior to zero -> narrowing the hypothesis space
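A sketch of the update over the two hypotheses, assuming data generated by a coin that in fact lands heads 90% of the time; the lopsided prior (0.99 on h0) is an arbitrary illustration of how its role diminishes as flips accumulate.

```python
# Posterior over two discrete hypotheses about theta after n flips.
thetas = {"h0": 0.5, "h1": 0.9}
prior = {"h0": 0.99, "h1": 0.01}                  # lopsided, for illustration

def posterior(nh, nt):
    unnorm = {h: prior[h] * t ** nh * (1 - t) ** nt
              for h, t in thetas.items()}
    z = sum(unnorm.values())
    return {h: u / z for h, u in unnorm.items()}

# Data with 90% heads: the prior's influence washes out with more flips.
for n in (10, 100, 1000):
    nh = int(0.9 * n)
    print(n, posterior(nh, n - nh))
```

Note that a hypothesis given prior probability zero stays at zero no matter how much data arrives, which is why zeroing a prior amounts to narrowing the hypothesis space.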
10/25/10 Infinite Hypothesis Spaces
Consider all values of θ, 0 ≤ θ ≤ 1. Inferring θ is just like any other sort of Bayesian inference. The likelihood is as before: P(d | θ) = θ^NH (1 - θ)^NT. With a uniform prior on θ, we obtain p(θ | d) ∝ θ^NH (1 - θ)^NT. This is a beta distribution: Beta(NH+1, NT+1).
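A sketch of the resulting posterior using scipy.stats.beta; the counts NH = 7, NT = 3 are illustrative.

```python
# Posterior over theta under a uniform prior: Beta(NH + 1, NT + 1).
from scipy.stats import beta

nh, nt = 7, 3
post = beta(nh + 1, nt + 1)

print("posterior mean:", post.mean())             # (NH+1)/(NH+NT+2) = 8/12
print("95% credible interval:", post.interval(0.95))
```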
10/25/10 Beta Distribution Beta(α,β)
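For reference, the usual Beta(α, β) density over θ ∈ [0, 1] (a standard formula, not transcribed from the slide):

```latex
p(\theta \mid \alpha, \beta)
  = \frac{\theta^{\alpha-1}\,(1-\theta)^{\beta-1}}{B(\alpha,\beta)},
\qquad
B(\alpha,\beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)}
```

Its mean is α / (α + β), which is why Beta(NH+1, NT+1) has mean (NH+1) / (NH+NT+2).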
10/25/10 Incorporating Priors
Suppose we have the prior distribution p(θ) = Beta(VH, VT). Then the posterior distribution is also Beta: p(θ | d) = Beta(VH+NH, VT+NT). This is an example of conjugate priors: the beta distribution is the conjugate prior for a binomial or Bernoulli likelihood.
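A sketch of the conjugate update in code; the prior pseudo-counts VH = VT = 2 are an arbitrary illustrative choice.

```python
# Conjugate update: Beta(VH, VT) prior + (NH, NT) data
# -> Beta(VH + NH, VT + NT) posterior.
def beta_update(vh, vt, nh, nt):
    """Parameters of the posterior Beta distribution."""
    return vh + nh, vt + nt

a, b = beta_update(2, 2, nh=7, nt=3)              # prior Beta(2, 2)
print(f"posterior: Beta({a}, {b}), mean = {a / (a + b):.3f}")
```

The pseudo-counts make the prior's effect concrete: it behaves like VH + VT imaginary flips observed before the real data.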
10/25/10 Prior Distribution For θ
§ Direct assessment
§ Parametric distributions
§ Conjugate distribution
§ Mixtures of conjugate distributions (sketched below)
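A sketch of the last option, assuming a two-component mixture of Beta priors (the components, weights, and counts are hypothetical): a mixture of conjugate priors is itself conjugate, with each component updated as usual and the mixture weights rescaled by each component's marginal likelihood B(a+NH, b+NT) / B(a, b).

```python
# Posterior for a mixture-of-Betas prior: update each component's
# parameters as usual, and reweight components by how well each one
# predicted the data (its marginal likelihood).
import numpy as np
from scipy.special import betaln

components = [(1.0, 1.0), (10.0, 2.0)]            # Beta(a, b) components
weights = np.array([0.5, 0.5])                    # prior mixture weights
nh, nt = 7, 3                                     # observed heads/tails

log_ml = np.array([betaln(a + nh, b + nt) - betaln(a, b)
                   for a, b in components])
new_w = weights * np.exp(log_ml - log_ml.max())   # numerically stable
new_w /= new_w.sum()

posterior = [(a + nh, b + nt) for a, b in components]
print(list(zip(np.round(new_w, 3), posterior)))
```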
Background image of page 12
10/25/10 ALL BLACK SLIDES STOLEN FROM DAVID