CS 70: Discrete Mathematics and Probability Theory
Spring 2010, Alistair Sinclair
Lecture 17
I.I.D. Random Variables
Estimating the bias of a coin
Question: We want to estimate the proportion p of Democrats in the US population by taking a small random sample. How large does our sample have to be to guarantee that our estimate will be within (say) 10% (in relative terms) of the true value with probability at least 0.95?
This is perhaps the most basic statistical estimation problem, and shows up everywhere. We will develop a simple solution that uses only Chebyshev's inequality. More refined methods can be used to get sharper results.
Let's denote the size of our sample by n (to be determined), and the number of Democrats in it by the random variable S_n. (The subscript n just reminds us that the r.v. depends on the size of the sample.) Then our estimate will be the value

    A_n = (1/n) S_n.
Now as has often been the case, we will find it helpful to write S_n = X_1 + X_2 + ··· + X_n, where

    X_i = 1 if person i in the sample is a Democrat, and X_i = 0 otherwise.
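To make this setup concrete, here is a minimal simulation sketch. The true bias 0.55 and sample size 1000 are arbitrary values chosen for illustration, and the function name estimate_p is ours, not the note's: each indicator X_i is a coin toss with Heads probability p, and the estimate is A_n = S_n / n.

```python
import random

def estimate_p(p_true, n, rng):
    """Draw n i.i.d. Bernoulli(p_true) indicators X_i and return A_n = S_n / n."""
    s_n = sum(1 if rng.random() < p_true else 0 for _ in range(n))
    return s_n / n

rng = random.Random(0)
a_n = estimate_p(0.55, n=1000, rng=rng)
print(a_n)  # a single estimate; typically close to the true bias 0.55
```

Note that a single run gives only one realization of A_n; how close it tends to be to p, and how that depends on n, is exactly what the rest of the note quantifies.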
Note that each X_i can be viewed as a coin toss, with Heads probability p (though of course we do not know the value of p!). And the coin tosses are independent. We call such a family of random variables independent, identically distributed, or i.i.d. for short. (For a precise definition of independent random variables, see the next lecture note; for now we work with the intuitive meaning that knowing the value of any subset of the r.v.'s does not change the distribution of the others.)
What is the expectation of our estimate?

    E(A_n) = E((1/n) S_n) = (1/n) E(X_1 + X_2 + ··· + X_n) = (1/n) × (np) = p.
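We can check this unbiasedness empirically: averaging many independent copies of A_n should give a value close to p. A minimal sketch, where the values p = 0.3, n = 50, and the repetition count are arbitrary choices for illustration:

```python
import random

rng = random.Random(1)

def estimate(p_true, n):
    """One estimate A_n = S_n / n from a sample of size n."""
    return sum(rng.random() < p_true for _ in range(n)) / n

# Average A_n over many independent repetitions: the empirical mean of the
# estimator should be close to p, reflecting E(A_n) = p for every n.
p_true, n, reps = 0.3, 50, 20000
mean_estimate = sum(estimate(p_true, n) for _ in range(reps)) / reps
print(mean_estimate)  # close to 0.3
```

Note that this holds even for small n: unbiasedness says nothing about how spread out the individual estimates are, only that they are centered at p.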
So for any value of n, our estimate will always have the correct expectation p. [Such a r.v. is often called an unbiased estimator of p.] Now presumably, as we increase our sample size n, our estimate should get more and more accurate. This will show up in the fact that the variance decreases with n: i.e., as n increases, the probability that we are far from the mean p will get smaller.
To see this, we need to compute Var(A_n). But A_n = (1/n) ∑_{i=1}^n X_i, which is just a constant times a sum of independent random variables.
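Since the X_i are independent, the variance of their sum is the sum of the variances, and Var(cX) = c² Var(X); together these give Var(A_n) = p(1−p)/n (this standard fact is what the computation below assumes). A minimal simulation sketch comparing the empirical variance of A_n against p(1−p)/n, with the bias 0.55 and the sample sizes arbitrary choices for illustration:

```python
import random

rng = random.Random(2)

def estimate(p_true, n):
    """One estimate A_n = S_n / n from n i.i.d. Bernoulli(p_true) draws."""
    return sum(rng.random() < p_true for _ in range(n)) / n

p_true, reps = 0.55, 2000
results = {}
for n in (10, 100, 1000):
    samples = [estimate(p_true, n) for _ in range(reps)]
    mean = sum(samples) / reps
    # empirical variance of the estimator A_n across the repetitions
    results[n] = sum((a - mean) ** 2 for a in samples) / reps
    print(n, results[n], p_true * (1 - p_true) / n)
```

Each printed line shows the empirical variance shrinking roughly like 1/n and tracking the theoretical value p(1−p)/n, which is the quantitative form of "more samples give a more accurate estimate."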
Theorem 17.1