CS 70 Lecture 17

# CS 70 Lecture 17 - CS 70 Spring 2010 Discrete Mathematics...


CS 70 Discrete Mathematics and Probability Theory, Spring 2010, Alistair Sinclair, Lecture 17

## I.I.D. Random Variables

### Estimating the bias of a coin

Question: We want to estimate the proportion $p$ of Democrats in the US population, by taking a small random sample. How large does our sample have to be to guarantee that our estimate will be within (say) 10% (in relative terms) of the true value with probability at least 0.95?

This is perhaps the most basic statistical estimation problem, and shows up everywhere. We will develop a simple solution that uses only Chebyshev's inequality. More refined methods can be used to get sharper results.

Let's denote the size of our sample by $n$ (to be determined), and the number of Democrats in it by the random variable $S_n$. (The subscript $n$ just reminds us that the r.v. depends on the size of the sample.) Then our estimate will be the value $A_n = \frac{1}{n} S_n$.

Now as has often been the case, we will find it helpful to write $S_n = X_1 + X_2 + \cdots + X_n$, where

$$X_i = \begin{cases} 1 & \text{if person } i \text{ in the sample is a Democrat;} \\ 0 & \text{otherwise.} \end{cases}$$

Note that each $X_i$ can be viewed as a coin toss, with Heads probability $p$ (though of course we do not know the value of $p$!). And the coin tosses are independent. We call such a family of random variables independent, identically distributed, or i.i.d. for short. (For a precise definition of independent random variables, see the next lecture note; for now we work with the intuitive meaning that knowing the value of any subset of the r.v.'s does not change the distribution of the others.)

What is the expectation of our estimate?

$$E(A_n) = E\left(\tfrac{1}{n} S_n\right) = \tfrac{1}{n} E(X_1 + X_2 + \cdots + X_n) = \tfrac{1}{n} \times (np) = p.$$

So for any value of $n$, our estimate will always have the correct expectation $p$. [Such a r.v. is often called an unbiased estimator of $p$.] Now presumably, as we increase our sample size $n$, our estimate should get more and more accurate.
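The unbiasedness calculation above can be checked with a small simulation. The sketch below is only an illustration, not part of the lecture: the function names, the value `p = 0.3`, the sample size, the number of trials, and the seed are all assumptions chosen for the example. It draws many independent samples, computes $A_n$ for each, and averages the results, which should land close to $p$.

```python
import random


def sample_estimate(p, n, rng):
    """Return A_n = S_n / n for one sample of n independent coin tosses.

    Each toss X_i is 1 with probability p and 0 otherwise.
    """
    s_n = sum(1 if rng.random() < p else 0 for _ in range(n))
    return s_n / n


def average_estimate(p, n, trials, seed=0):
    """Average A_n over many independent samples (hypothetical helper).

    Since E(A_n) = p, this average should be close to p when trials is large.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(sample_estimate(p, n, rng) for _ in range(trials))
    return total / trials


# Illustrative values only; in the real problem p is unknown.
p_true = 0.3
print(average_estimate(p_true, n=100, trials=5000))
```

Note that a single $A_n$ may still be some distance from $p$; only the average over many samples is being compared with $p$ here.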
The improvement in accuracy for larger samples will show up in the fact that the variance decreases with $n$: i.e., as $n$ increases, the probability that we are far from the mean $p$ will get smaller. To see this, we need to compute $\mathrm{Var}(A_n)$. But $A_n = \frac{1}{n} \sum_{i=1}^{n} X_i$, which is just a constant times a sum of independent random variables.

Theorem 17.1
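The claim that $A_n$ spreads less around $p$ as $n$ grows can also be seen empirically. The following is a minimal sketch under assumed values (`p = 0.3`, the two sample sizes, the trial count, and the seed are all made up for illustration, and the helper names are hypothetical). It measures the empirical variance of $A_n$ across repeated samples; it does not compute the exact variance.

```python
import random
import statistics


def estimate(p, n, rng):
    """Return A_n = (1/n) * (X_1 + ... + X_n) for n independent tosses."""
    return sum(1 if rng.random() < p else 0 for _ in range(n)) / n


def empirical_variance(p, n, trials, seed=0):
    """Empirical variance of A_n over `trials` independent samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = [estimate(p, n, rng) for _ in range(trials)]
    return statistics.pvariance(estimates)


# Compare a small and a large sample size (illustrative values only).
for n in (10, 1000):
    print(n, empirical_variance(0.3, n, trials=500))
```

With these settings the printed variance for the larger sample size should be much smaller than for the smaller one.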
