STAT 531: Bayesian Methods
HM Kim
Department of Mathematics and Statistics
University of Calgary
Fall 2010

Bayes's theorem

Bayes's theorem relates to the problem of adjudicating between competing hypotheses given observations.

Suppose $D$ is an event, i.e. something that either happens or doesn't. Suppose $C_1, C_2, \dots, C_J$ are other events that form a partition. $C_1, C_2, \dots, C_J$ can be thought of as competing hypotheses to explain the observed event, $D$.

In that case, the conditional probability of $C_j$ given $D$ is

\[ p(C_j \mid D) = \frac{p(D \mid C_j)\, p(C_j)}{p(D)}. \]

To calculate $p(D)$, we may need a further result, the Law of Total Probability.

↪ Law of Total Probability: The overall, or marginal, probability of the event $D$ can be expressed in terms of the probabilities of the $C_j$ and the conditional probabilities of $D$ given each of the $C_j$, as follows:

\[ p(D) = \sum_{j=1}^{J} p(D \mid C_j)\, p(C_j). \]

↪ Example: An item is produced in 3 different factories, $C_1$, $C_2$, $C_3$. The proportions produced in the 3 factories, and the proportions defective in each, are as follows:

factory    % produced    % defective
C_1        50            2
C_2        30            3
C_3        20            4

An item is purchased and found to be defective. This is event $D$. What is the probability that it was from factory $C_1$?

First, we find the overall probability of a defective, $p(D)$, from the Law of Total Probability:

\[ p(D) = p(D \mid C_1)\, p(C_1) + p(D \mid C_2)\, p(C_2) + p(D \mid C_3)\, p(C_3) = 0.02 \times 0.5 + 0.03 \times 0.3 + 0.04 \times 0.2 = 0.027. \]

Then, Bayes's theorem tells us the probability that the item was from factory $C_1$:

\[ p(C_1 \mid D) = \frac{p(D \mid C_1)\, p(C_1)}{p(D)} = \frac{0.02 \times 0.5}{0.027} \approx 0.37. \]

↪ Another version:

\[ p(C_j \mid D) = \frac{p(D \mid C_j)\, p(C_j)}{p(D)} \implies p(C_j \mid D) \propto p(D \mid C_j)\, p(C_j). \]

To implement this, we calculate $p(D \mid C_j)\, p(C_j)$ for each $j$, add them up, and then divide each by the sum so that they add up to 1 (which they have to, because they are the probabilities of a partition).

\[ p(C_1 \mid D) \propto p(D \mid C_1)\, p(C_1) = 0.02 \times 0.5 = 0.010 \]
\[ p(C_2 \mid D) \propto p(D \mid C_2)\, p(C_2) = 0.03 \times 0.3 = 0.009 \]
\[ p(C_3 \mid D) \propto p(D \mid C_3)\, p(C_3) = 0.04 \times 0.2 = 0.008 \]

Then

\[ p(C_1 \mid D) = \frac{p(D \mid C_1)\, p(C_1)}{0.010 + 0.009 + 0.008} = \frac{0.010}{0.027} \approx 0.37. \]

Another way of looking at this is that
$C_1$, $C_2$ and $C_3$ are the possible states of nature (parameters);
$D$ is the data (datum).
We then use the data to decide how likely the different states of nature are relative to one another. This is the idea that underlies Bayesian statistics.

↪ Likelihood, prior and posterior probabilities: $p(D \mid C_j)$ is the probability of the data given the state of nature $C_j$ ....
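The factory example can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the dictionary names `prior` and `likelihood` and the helper `posterior` are hypothetical, chosen only for illustration. It multiplies each likelihood $p(D \mid C_j)$ by its prior $p(C_j)$, sums the products to obtain $p(D)$ (the Law of Total Probability), and divides each product by that sum.

```python
# Priors p(C_j): share of items produced by each factory (from the table).
prior = {"C1": 0.50, "C2": 0.30, "C3": 0.20}

# Likelihoods p(D | C_j): share of defective items at each factory.
likelihood = {"C1": 0.02, "C2": 0.03, "C3": 0.04}


def posterior(prior, likelihood):
    """Return (p(D), {C_j: p(C_j | D)}) for a finite partition.

    Hypothetical helper for illustration; assumes both dicts share
    the same keys and that the priors sum to 1.
    """
    # Unnormalised posterior: p(D | C_j) * p(C_j) for each j.
    unnormalised = {c: likelihood[c] * prior[c] for c in prior}
    # Law of Total Probability: p(D) is the sum of those products.
    p_d = sum(unnormalised.values())
    # Normalise so the posterior probabilities add up to 1.
    return p_d, {c: u / p_d for c, u in unnormalised.items()}


p_d, post = posterior(prior, likelihood)
print("p(D) =", round(p_d, 3))
for factory, p in post.items():
    print(f"p({factory} | D) = {p:.2f}")
```

Under these inputs the sketch should give $p(D) \approx 0.027$ and a posterior probability of about 0.37 for $C_1$, in agreement with the hand calculation; the three posterior probabilities sum to 1 by construction.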
 Spring '11
 GABORLUKACS
 Statistics, Probability theory, Bayesian probability, θ, posterior distribution
