# lec04 - Bayesian Learning

CS464 Introduction to Machine Learning

## Features of Bayesian Learning

- Each observed training example can incrementally decrease or increase the estimated probability that a hypothesis is correct. This provides a more flexible approach to learning than algorithms that completely eliminate a hypothesis if it is found to be inconsistent with any single example.
- Prior knowledge can be combined with observed data to determine the final probability of a hypothesis. In Bayesian learning, prior knowledge is provided by asserting a prior probability for each candidate hypothesis, and a probability distribution over observed data for each possible hypothesis.
- Bayesian methods can accommodate hypotheses that make probabilistic predictions.
- New instances can be classified by combining the predictions of multiple hypotheses, weighted by their probabilities.
- Even in cases where Bayesian methods prove computationally intractable, they can provide a standard of optimal decision making against which other practical methods can be measured.
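The idea of classifying a new instance by combining the predictions of multiple hypotheses, weighted by their probabilities, can be sketched as follows. The hypotheses, posteriors, and per-hypothesis predictions below are illustrative assumptions, not values from the lecture:

```python
def bayes_optimal_classify(posteriors, predictions, labels):
    """Combine hypothesis predictions, weighted by posterior probability.

    posteriors:  {h: P(h|D)} for each candidate hypothesis h
    predictions: {h: {label: P(label|h)}} for each hypothesis
    labels:      the possible class labels
    """
    scores = {}
    for label in labels:
        # Weight each hypothesis's vote for this label by P(h|D)
        scores[label] = sum(posteriors[h] * predictions[h][label]
                            for h in posteriors)
    return max(scores, key=scores.get)

# Illustrative example: the single most probable hypothesis (h1) predicts "+",
# but the probability-weighted combination of all hypotheses predicts "-".
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
predictions = {
    "h1": {"+": 1.0, "-": 0.0},
    "h2": {"+": 0.0, "-": 1.0},
    "h3": {"+": 0.0, "-": 1.0},
}
print(bayes_optimal_classify(posteriors, predictions, ["+", "-"]))  # prints "-"
```

Note how weighting by probability can disagree with simply trusting the single most probable hypothesis: here "-" wins with combined weight 0.6 against 0.4 for "+".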

## Difficulties with Bayesian Methods

- They require initial knowledge of many probabilities. When these probabilities are not known in advance, they are often estimated based on background knowledge, previously available data, and assumptions about the form of the underlying distributions.
- Significant computational cost is required to determine the Bayes optimal hypothesis in the general case (linear in the number of candidate hypotheses). In certain specialized situations, this computational cost can be significantly reduced.
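When the required probabilities are not known in advance, a common way to estimate them from previously available data is relative frequency, optionally with smoothing to avoid zero estimates. A minimal sketch, with made-up labels and a hypothetical `smoothing` parameter:

```python
from collections import Counter

def estimate_priors(labels, smoothing=1.0):
    """Estimate class prior probabilities as smoothed relative frequencies.

    labels:    observed class labels from previously available data
    smoothing: Laplace smoothing constant (0 gives raw frequencies)
    """
    counts = Counter(labels)
    classes = sorted(counts)
    total = len(labels) + smoothing * len(classes)
    return {c: (counts[c] + smoothing) / total for c in classes}

# Illustrative data: 3 "ham" and 1 "spam" observation
print(estimate_priors(["spam", "ham", "ham", "ham"]))
```

With smoothing of 1, the estimates are (3+1)/6 for "ham" and (1+1)/6 for "spam"; the assumptions about the form of the underlying distribution enter through the choice of smoothing.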
## Bayes Theorem

In machine learning, we try to determine the best hypothesis from some hypothesis space H, given the observed training data D. In Bayesian learning, the best hypothesis means the most probable hypothesis, given the data D plus any initial knowledge about the prior probabilities of the various hypotheses in H. Bayes theorem provides a way to calculate the probability of a hypothesis based on its prior probability, the probabilities of observing various data given the hypothesis, and the observed data itself.
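Finding the most probable hypothesis given the data can be sketched as a brute-force application of Bayes theorem over a small hypothesis space: multiply each prior by the likelihood of the data under that hypothesis, then normalize. The two-hypothesis example and all numbers below are illustrative assumptions:

```python
def posterior_probs(priors, likelihoods):
    """Compute P(h|D) for each hypothesis h via Bayes theorem.

    priors:      {h: P(h)}
    likelihoods: {h: P(D|h)} for the fixed observed data D
    """
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalized.values())  # P(D) = sum over h of P(D|h)P(h)
    return {h: p / evidence for h, p in unnormalized.items()}

# Illustrative two-hypothesis space: does the patient have the disease?
priors = {"disease": 0.008, "no_disease": 0.992}
likelihoods = {"disease": 0.98, "no_disease": 0.03}  # P(positive test | h)
post = posterior_probs(priors, likelihoods)
print(round(post["disease"], 3))
```

Note that even a highly accurate test leaves the posterior for the rare hypothesis fairly low, because the prior dominates; this is exactly the interplay of prior and likelihood the slide describes. The cost of this computation is linear in the number of candidate hypotheses, matching the difficulty noted above.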

## Bayes Theorem

- P(h) is the prior probability of hypothesis h: the initial probability that h holds, before observing the training data. P(h) may reflect any background knowledge we have about the chance that h is correct. If we have no such prior knowledge, then each candidate hypothesis might simply get the same prior probability.
- P(D) is the prior probability of the training data D: the probability of observing D given no knowledge about which hypothesis holds.
- P(h|D) is the posterior probability of h given D: it reflects our confidence that h holds after we have seen the training data D. Unlike the prior P(h), the posterior P(h|D) reflects the influence of the observed data.
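The quantities defined above combine in Bayes theorem:

```latex
P(h \mid D) = \frac{P(D \mid h)\, P(h)}{P(D)}
```

where P(D|h), the likelihood, is the probability of observing the data D given that hypothesis h holds.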