CS 70 Discrete Mathematics and Probability Theory
Spring 2009, Alistair Sinclair, David Tse
Lecture 19

I.I.D. Random Variables

Estimating the bias of a coin

Question: We want to estimate the proportion $p$ of Democrats in the US population by taking a small random sample. How large does our sample have to be to guarantee that our estimate will be within (say) 10% (in relative terms) of the true value with probability at least 0.95?

This is perhaps the most basic statistical estimation problem, and it shows up everywhere. We will develop a simple solution that uses only Chebyshev's inequality. More refined methods can be used to get sharper results.

Let's denote the size of our sample by $n$ (to be determined), and the number of Democrats in it by the random variable $S_n$. (The subscript $n$ just reminds us that the r.v. depends on the size of the sample.) Then our estimate will be the value $A_n = \frac{1}{n} S_n$.

Now, as has often been the case, we will find it helpful to write $S_n = X_1 + X_2 + \cdots + X_n$, where

$$X_i = \begin{cases} 1 & \text{if person } i \text{ in the sample is a Democrat;} \\ 0 & \text{otherwise.} \end{cases}$$

Note that each $X_i$ can be viewed as a coin toss, with Heads probability $p$ (though of course we do not know the value of $p$!). And the coin tosses are independent. [1]

What is the expectation of our estimate?

$$E(A_n) = E\left(\tfrac{1}{n} S_n\right) = \tfrac{1}{n} E(X_1 + X_2 + \cdots + X_n) = \tfrac{1}{n}(np) = p.$$

So for any value of $n$, our estimate will always have the correct expectation $p$. [Such a r.v. is often called an unbiased estimator of $p$.] Now presumably, as we increase our sample size $n$, our estimate should get more and more accurate. This will show up in the fact that the variance decreases with $n$: i.e., as $n$ increases, the probability that we are far from the mean $p$ will get smaller.

To see this, we need to compute $\mathrm{Var}(A_n)$. But $A_n = \frac{1}{n} \sum_{i=1}^{n} X_i$, which is just a constant times a sum of independent random variables.
So we can compute $\mathrm{Var}(A_n)$ using the technology we established in the last lecture note:

$$\mathrm{Var}(A_n) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \left(\frac{1}{n}\right)^2 \mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \left(\frac{1}{n}\right)^2 \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{\sigma^2}{n},$$

where we have written $\sigma^2$ for the variance of each of the $X_i$. So we see that the variance of $A_n$ decreases in inverse proportion to $n$: doubling the sample size halves the variance. This fact ensures that, as we take larger and larger sample sizes $n$, the probability that we deviate much from the expectation $p$ gets smaller and smaller.

Let's now use Chebyshev's inequality to figure out how large $n$ has to be to ensure a specified accuracy in our estimate of the proportion of Democrats $p$. A natural way to measure this is for us to specify two ...

[1] We are assuming here that the sampling is done with replacement; i.e., we select each person in the sample from the entire population, including those we have already picked. So there is a small chance that we will pick the same person twice. ...
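The two facts derived above, that the estimate has expectation p and variance equal to the variance of a single X_i divided by n, can be checked empirically. The following is a minimal simulation sketch, not part of the original notes. The function names, the value p = 0.4, the sample sizes, and the number of trials are all hypothetical choices made for illustration. It also relies on the standard fact (not stated in this excerpt) that a 0/1 random variable with success probability p has variance p(1 - p).

```python
import random
import statistics


def sample_proportion(p: float, n: int, rng: random.Random) -> float:
    """Return A_n = S_n / n for n independent 0/1 draws with success probability p."""
    s_n = sum(1 if rng.random() < p else 0 for _ in range(n))
    return s_n / n


def simulate(p: float, n: int, trials: int, seed: int = 0) -> tuple[float, float]:
    """Return the empirical mean and variance of A_n over `trials` independent samples."""
    rng = random.Random(seed)
    estimates = [sample_proportion(p, n, rng) for _ in range(trials)]
    return statistics.fmean(estimates), statistics.pvariance(estimates)


if __name__ == "__main__":
    p = 0.4  # hypothetical "true" proportion, unknown in a real survey
    for n in (25, 100, 400):
        mean, var = simulate(p, n, trials=20_000)
        theory = p * (1 - p) / n  # sigma^2 / n with sigma^2 = p(1 - p)
        print(f"n={n:4d}  mean={mean:.4f}  var={var:.6f}  sigma^2/n={theory:.6f}")
```

Under these assumptions the empirical mean should stay close to p for every n, while the empirical variance should shrink by roughly a factor of four each time n is quadrupled.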
This note was uploaded on 09/06/2009 for the course CS 70 taught by Professor Papadimitriou during the Spring '08 term at University of California, Berkeley.