CS 70-2 Discrete Mathematics and Probability Theory
Spring 2009
Alistair Sinclair, David Tse
Lecture 19

I.I.D. Random Variables

Estimating the bias of a coin

Question: We want to estimate the proportion $p$ of Democrats in the US population by taking a small random sample. How large does our sample have to be to guarantee that our estimate will be within (say) 10% (in relative terms) of the true value with probability at least 0.95?

This is perhaps the most basic statistical estimation problem, and it shows up everywhere. We will develop a simple solution that uses only Chebyshev's inequality. More refined methods can be used to get sharper results.

Let's denote the size of our sample by $n$ (to be determined), and the number of Democrats in it by the random variable $S_n$. (The subscript $n$ just reminds us that the r.v. depends on the size of the sample.) Then our estimate will be the value $A_n = \frac{1}{n} S_n$.

Now, as has often been the case, we will find it helpful to write $S_n = X_1 + X_2 + \cdots + X_n$, where

$$X_i = \begin{cases} 1 & \text{if person } i \text{ in the sample is a Democrat;} \\ 0 & \text{otherwise.} \end{cases}$$

Note that each $X_i$ can be viewed as a coin toss, with Heads probability $p$ (though of course we do not know the value of $p$!). And the coin tosses are independent.[1]

What is the expectation of our estimate?

$$E(A_n) = E\left(\frac{1}{n} S_n\right) = \frac{1}{n} E(X_1 + X_2 + \cdots + X_n) = \frac{1}{n}(np) = p.$$

So for any value of $n$, our estimate will always have the correct expectation $p$. [Such a r.v. is often called an unbiased estimator of $p$.]

Now presumably, as we increase our sample size $n$, our estimate should get more and more accurate. This will show up in the fact that the variance decreases with $n$: i.e., as $n$ increases, the probability that we are far from the mean $p$ will get smaller. To see this, we need to compute $\mathrm{Var}(A_n)$. But $A_n = \frac{1}{n} \sum_{i=1}^{n} X_i$, which is just a constant times a sum of independent random variables.
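The unbiasedness claim $E(A_n) = p$ can be checked empirically. The following is a minimal simulation sketch, not part of the original notes: the function names `sample_estimate` and `mean_of_estimates`, the choice $p = 0.3$, and the fixed seed are all illustrative assumptions. It draws many independent samples of size $n$ (with replacement, as in the notes) and averages the resulting estimates.

```python
import random


def sample_estimate(p, n, rng):
    """Return A_n = S_n / n for n independent 0/1 draws with success probability p.

    Each draw plays the role of X_i: 1 if the sampled person is a Democrat, else 0.
    """
    s_n = sum(1 for _ in range(n) if rng.random() < p)
    return s_n / n


def mean_of_estimates(p, n, trials, seed=0):
    """Average A_n over many independent repetitions (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    return sum(sample_estimate(p, n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    # p = 0.3 is an arbitrary illustrative value; in practice p is unknown.
    for n in (10, 100, 1000):
        print(n, round(mean_of_estimates(0.3, n, trials=2000), 3))
```

For every sample size the average of the estimates should sit close to $p$, which is what unbiasedness predicts; what changes with $n$ is how tightly individual estimates cluster around $p$.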
So we can compute $\mathrm{Var}(A_n)$ using the technology we established in the last lecture note:

$$\mathrm{Var}(A_n) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \left(\frac{1}{n}\right)^2 \mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \left(\frac{1}{n}\right)^2 \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{\sigma^2}{n},$$

where we have written $\sigma^2$ for the variance of each of the $X_i$. (The third equality uses the independence of the $X_i$.) So we see that the variance of $A_n$ is inversely proportional to $n$. This fact ensures that, as we take larger and larger sample sizes $n$, the probability that we deviate much from the expectation $p$ gets smaller and smaller.

Let's now use Chebyshev's inequality to figure out how large $n$ has to be to ensure a specified accuracy in our estimate of the proportion of Democrats $p$. A natural way to measure this is for us to specify two ...

[1] We are assuming here that the sampling is done with replacement; i.e., we select each person in the sample from the entire population, including those we have already picked. So there is a small chance that we will pick the same person twice. ...
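The formula $\mathrm{Var}(A_n) = \sigma^2 / n$ can also be checked numerically. The sketch below is an illustration rather than anything from the notes: the function names are hypothetical, and it relies on the standard fact (not stated in this excerpt) that a 0/1 random variable with success probability $p$ has variance $\sigma^2 = p(1-p)$.

```python
import random


def empirical_variance_of_estimate(p, n, trials, seed=0):
    """Estimate Var(A_n) by simulating A_n many times (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    estimates = []
    for _ in range(trials):
        # S_n: number of successes among n independent draws with replacement.
        s_n = sum(1 for _ in range(n) if rng.random() < p)
        estimates.append(s_n / n)
    mean = sum(estimates) / trials
    return sum((a - mean) ** 2 for a in estimates) / trials


def theoretical_variance_of_estimate(p, n):
    """Return sigma^2 / n, assuming sigma^2 = p * (1 - p) for a 0/1 variable."""
    return p * (1 - p) / n


if __name__ == "__main__":
    # p = 0.3 is an arbitrary illustrative value.
    for n in (10, 100, 1000):
        print(n,
              empirical_variance_of_estimate(0.3, n, trials=2000),
              theoretical_variance_of_estimate(0.3, n))
```

Each tenfold increase in $n$ should shrink both columns by roughly a factor of ten, matching the $1/n$ dependence derived above.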
This note was uploaded on 09/06/2009 for the course CS 70 taught by Professor Papadimitriou during the Spring '08 term at University of California, Berkeley.