CSC 411 / CSC D11 Estimation

# 7 Estimation

We now consider the problem of determining unknown parameters of the world based on measurements. The general problem is one of *inference*, which describes the probabilities of these unknown parameters. Given a model, these probabilities can be derived using Bayes' Rule. The simplest use of these probabilities is to perform *estimation*, in which we attempt to come up with single "best" estimates of the unknown parameters.

## 7.1 Learning a binomial distribution

For a simple example, we return to coin-flipping. We flip a coin $N$ times, with the result of the $i$-th flip denoted by a variable $c_i$: "$c_i = \text{heads}$" means that the $i$-th flip came up heads. The probability that the coin lands heads on any given trial is given by a parameter $\theta$. We have no prior knowledge as to the value of $\theta$, and so our prior distribution on $\theta$ is uniform.¹ In other words, we describe $\theta$ as coming from a uniform distribution on $[0, 1]$, so $p(\theta) = 1$; we believe that all values of $\theta$ are equally likely if we have not seen any data. We further assume that the individual coin flips are independent, i.e., $P(c_{1:N} \mid \theta) = \prod_i P(c_i \mid \theta)$. (The notation "$c_{1:N}$" indicates the set of observations $\{c_1, \dots, c_N\}$.) We can summarize this model as follows:

**Model: Coin-Flipping**

$$
\begin{aligned}
\theta &\sim U(0, 1) \\
P(c = \text{heads}) &= \theta \\
P(c_{1:N} \mid \theta) &= \prod_i P(c_i \mid \theta)
\end{aligned}
\tag{1}
$$

Suppose we wish to learn about a coin by flipping it 1000 times and observing the results $c_{1:1000}$, in which the coin landed heads 750 times. What is our belief about $\theta$, given this data? We now need to solve for $p(\theta \mid c_{1:1000})$, i.e., our belief about $\theta$ after seeing the 1000 coin flips.
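The likelihood in this model is easy to evaluate numerically. The following sketch (not part of the original notes; the function names are my own) computes the log-likelihood $\log P(c_{1:N} \mid \theta)$ on a grid of candidate values of $\theta$ for the 750-heads experiment, and confirms that it peaks at $\theta = 750/1000$:

```python
import math

def log_likelihood(theta, heads, n):
    """Log of P(c_{1:N} | theta) = theta^heads * (1 - theta)^(n - heads)."""
    return heads * math.log(theta) + (n - heads) * math.log(1 - theta)

# The experiment from the text: 1000 flips, 750 of which land heads.
heads, n = 750, 1000

# Evaluate the log-likelihood on a grid of candidate theta values in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=lambda t: log_likelihood(t, heads, n))
print(best)  # -> 0.75, i.e. heads/n
```

Working in log space avoids underflow: $\theta^{750}(1-\theta)^{250}$ is astronomically small for every $\theta$, but its logarithm is perfectly well behaved.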
To do this, we apply the basic rules of probability theory, beginning with the Product Rule:

$$
P(c_{1:1000}, \theta) = P(c_{1:1000} \mid \theta)\, p(\theta) = p(\theta \mid c_{1:1000})\, P(c_{1:1000}) \tag{2}
$$

Solving for the desired quantity gives:

$$
p(\theta \mid c_{1:1000}) = \frac{P(c_{1:1000} \mid \theta)\, p(\theta)}{P(c_{1:1000})} \tag{3}
$$

The numerator may be written using

$$
P(c_{1:1000} \mid \theta)\, p(\theta) = \prod_i P(c_i \mid \theta) = \theta^{750} (1 - \theta)^{1000 - 750} \tag{4}
$$

¹ We would usually expect a coin to be fair, i.e., that the prior distribution for $\theta$ is peaked near $0.5$.

Copyright © 2009 Aaron Hertzmann and David Fleet
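Because the prior is uniform, the numerator of Bayes' Rule is just the likelihood, so the unnormalized posterior can be compared between candidate values of $\theta$ without ever computing the denominator. A small sketch (my own illustration, not from the notes) comparing $\theta = 0.75$ against $\theta = 0.5$:

```python
import math

def log_unnorm_posterior(theta, heads, n):
    # With the uniform prior p(theta) = 1, the numerator of Bayes' Rule
    # (Eq. 4) is just the likelihood theta^heads * (1 - theta)^(n - heads).
    return heads * math.log(theta) + (n - heads) * math.log(1 - theta)

heads, n = 750, 1000

# Posterior odds of theta = 0.75 versus theta = 0.5, computed in log space.
ratio = math.exp(log_unnorm_posterior(0.75, heads, n)
                 - log_unnorm_posterior(0.5, heads, n))
print(ratio)
```

After 1000 flips the data favor $\theta = 0.75$ over $\theta = 0.5$ by a factor of more than $10^{50}$, which is why the posterior in Figure 1 is so sharply peaked.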

*Figure 1: Posterior probability of $\theta$ from two different experiments: one with a single coin flip (landing heads; H=1, T=0), and one with 1000 coin flips (750 of which land heads; H=750, T=250). Note that the latter distribution is much more peaked. (Note: the vertical scale is wrong on these plots; they should integrate to 1.)*

The denominator may be solved for by the marginalization rule:

$$
P(c_{1:1000}) = \int_0^1 P(c_{1:1000}, \theta)\, d\theta = \int_0^1 \theta^{750} (1 - \theta)^{1000 - 750}\, d\theta = Z \tag{5}
$$

where $Z$ is a constant (evaluating it requires more advanced math, but it is not necessary for our purposes).
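Although evaluating $Z$ in closed form takes more advanced math, it is easy to approximate numerically. The sketch below (my own check, not part of the notes) integrates Eq. (5) with a simple trapezoidal rule and compares the result against the closed form via the Beta function, $Z = \Gamma(751)\Gamma(251)/\Gamma(1002)$:

```python
import math

heads, n = 750, 1000

def unnorm(theta):
    # Integrand of Eq. (5): theta^750 * (1 - theta)^250, with prior p(theta) = 1.
    if theta <= 0.0 or theta >= 1.0:
        return 0.0
    return math.exp(heads * math.log(theta) + (n - heads) * math.log(1 - theta))

# Trapezoidal approximation of Z = integral_0^1 theta^750 (1-theta)^250 dtheta.
# (The endpoint terms vanish, so a plain Riemann sum is equivalent here.)
m = 100_000
Z = sum(unnorm(i / m) for i in range(m + 1)) / m

# Closed form via the Beta function, computed through log-gamma for stability.
Z_exact = math.exp(math.lgamma(heads + 1) + math.lgamma(n - heads + 1)
                   - math.lgamma(n + 2))
print(Z, Z_exact)  # the two agree to several significant digits
```

The integrand is a sharply peaked bump near $\theta = 0.75$, so a fine grid captures it accurately; dividing the unnormalized posterior by $Z$ yields the properly normalized curve sketched in Figure 1.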