MIT15_097S12_lec15


…will be $\hat{\theta}_{ML} = 1$, which predicts that we will never flip tails! However, we, the modeler, suspect that the coin is probably fair, and can assign $\alpha = \beta = 3$ (or some other numbers with $\alpha = \beta$), and we get $\hat{\theta}_{MAP} = 3/5$.

**Question** How would you set $\alpha$ and $\beta$ for the coin toss under a strong prior belief vs. a weak prior belief that the probability of Heads was 1/8?

For large samples it is easy to see for the coin flipping that the effect of the prior goes to zero:

$$\lim_{m \to \infty} \hat{\theta}_{MAP} = \lim_{m \to \infty} \hat{\theta}_{ML} = \theta_{true}.$$

Why? Recall what we know about regularization in machine learning: data plus knowledge implies generalization. The prior is the "knowledge" part. One could interpret the MAP estimate as a regularized version of the ML estimate, or a version with "shrinkage."

**Example 1.** (Rare Events) The MAP estimate is particularly useful when dealing with rare events. Suppose we are trying to estimate the probability that a given credit card transaction is fraudulent. Perhaps we…
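The behavior above can be sketched numerically. This is a minimal illustration, not from the notes: it uses the standard posterior-mode formula for a Bernoulli likelihood with a Beta$(\alpha, \beta)$ prior, $\hat{\theta}_{MAP} = (h + \alpha - 1)/(m + \alpha + \beta - 2)$, where $h$ is the number of heads in $m$ flips; the function names and the choice $h/m = 0.7$ in the convergence loop are ours.

```python
def theta_ml(h, m):
    """Maximum-likelihood estimate: the observed fraction of heads."""
    return h / m

def theta_map(h, m, alpha, beta):
    """MAP estimate: mode of the Beta(alpha + h, beta + m - h) posterior."""
    return (h + alpha - 1) / (m + alpha + beta - 2)

# One flip, one head: the ML estimate says tails can never occur.
print(theta_ml(1, 1))          # 1.0
# A Beta(3, 3) prior pulls the estimate toward a fair coin.
print(theta_map(1, 1, 3, 3))   # 0.6, i.e. the 3/5 in the notes

# As m grows, the prior's effect vanishes: with the heads fraction
# held at 0.7, both estimates approach 0.7.
for m in (10, 1000, 100000):
    h = int(0.7 * m)
    print(m, theta_ml(h, m), theta_map(h, m, 3, 3))
```

For the Question above, the same formula suggests the pattern: a prior belief that $P(\text{Heads}) = 1/8$ corresponds to choosing $\alpha, \beta$ with mode $(\alpha - 1)/(\alpha + \beta - 2) = 1/8$, and a *strong* belief means large $\alpha + \beta$ (e.g. $\alpha = 8, \beta = 50$) while a *weak* belief means small $\alpha + \beta$ (e.g. $\alpha = 2, \beta = 8$), since larger pseudo-counts take more data to override.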

This note is from the course MIT 15.097, taught by Professor Cynthia Rudin during the Spring '12 term at MIT.
