ready to state the most useful version of the strong law. The statement is
deceptively simple, and it will take some care to interpret what it means.
Theorem 1.5 (Strong Law of Large Numbers (SLLN, Version 2)). Let {Xm; m ≥ 1} be a sequence of IID rv's with a finite mean X̄. For each n, let Sn = X1 + · · · + Xn. Then

$$\Pr\Bigl\{\lim_{n\to\infty} \frac{S_n}{n} = \overline{X}\Bigr\} = 1. \tag{1.71}$$

Before proving the theorem, we must be clear about the meaning of the limit of a sequence
of rv's. For each sample point ω in the sample space, limn→∞ Sn(ω)/n is the limit of a sequence of real numbers. If the limit exists, it is a real number and otherwise the limit is undefined. Thus the limit is a possibly defective random variable, mapping each sample point to a real number or to a black hole.

The theorem then says three things: first, the set of ω ∈ Ω for which the limit exists is an event^19 according to the axioms of probability; second, this event has probability 1; and third, the limiting rv is deterministic and equal to X̄ with probability 1.

A sequence of rv's {Sn/n; n ≥ 1} that converges in the sense of (1.71) is said to converge with probability 1. Thus we usually express (1.71) as limn→∞ Sn/n = X̄ with probability 1, or even more briefly as limn→∞ Sn/n = X̄ W.P.1.
Proof: Surprisingly, the subtle mathematical issues about convergence of rv’s can be
avoided if we look carefully at Version 1 of the SLLN. In the form of (1.70),
$$\Pr\Bigl\{\bigcup_{k\ge 1}\,\bigcap_{n\ge 1}\,\bigcup_{m\ge n}\Bigl\{\Bigl|\frac{S_m}{m}-\overline{X}\Bigr|>\frac{1}{k}\Bigr\}\Bigr\}=0.$$

The complement of the event above must have probability 1, and using de Morgan's laws to find the complement, we get
$$\Pr\Bigl\{\bigcap_{k\ge 1}\,\bigcup_{n\ge 1}\,\bigcap_{m\ge n}\Bigl\{\Bigl|\frac{S_m}{m}-\overline{X}\Bigr|\le\frac{1}{k}\Bigr\}\Bigr\}=1. \tag{1.72}$$

To better understand this expression, define the event Ak as
$$A_k=\bigcup_{n\ge 1}\,\bigcap_{m\ge n}\Bigl\{\Bigl|\frac{S_m}{m}-\overline{X}\Bigr|\le\frac{1}{k}\Bigr\}.$$
Thus Version 1 of the SLLN states that Pr{∩k Ak} = 1. We now show that ∩k Ak is simply the set of ω for which limn Sn(ω)/n = X̄. To see this, note that ω ∈ Ak means that there is some n ≥ 1 such that |Sm(ω)/m − X̄| ≤ 1/k for all m ≥ n. It then follows that ω ∈ ∩k Ak means that for every k ≥ 1, there is some n ≥ 1 such that |Sm(ω)/m − X̄| ≤ 1/k for all m ≥ n. This is the definition of limn Sn(ω)/n = X̄. This means that the set of ω for which limn Sn(ω)/n = X̄ is the set of ω in ∩k Ak. Thus this set is an event (a countable intersection of countable unions of countable intersections of events), and this event has probability 1.
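The argument above is just the 1/k characterization of a limit. As an illustration (not from the text), the following Python sketch tests whether a finite prefix of a sequence belongs to the event Ak; the function name `in_A_k` and the example sequences are assumptions for this sketch, and a finite prefix can only approximate the "for all m ≥ n" condition that Ak actually requires.

```python
def in_A_k(seq, limit, k):
    """Check the A_k condition on a finite prefix: does there exist an index n
    such that |seq[m] - limit| <= 1/k for all m >= n (up to the prefix length)?"""
    M = len(seq)
    for n in range(M):
        if all(abs(seq[m] - limit) <= 1 / k for m in range(n, M)):
            return True
    return False

# a_m = 1/2 + 1/m converges to 1/2, so its prefix lies in A_k for each k tested;
# a constant sequence at 1.0 stays 0.5 away from the limit 0.5 and never enters A_k for k >= 3.
seq = [0.5 + 1 / m for m in range(1, 2001)]
print(all(in_A_k(seq, 0.5, k) for k in range(1, 50)))  # prints True
print(in_A_k([1.0] * 50, 0.5, 10))                     # prints False
```

Membership in ∩k Ak then corresponds to the check succeeding for every k, which is exactly the statement that the sequence converges to the limit.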
^19 This is the only place in this text where one must seriously question whether a subset of sample points is actually an event. The extreme peculiarity of this subset can be seen by visualizing a Bernoulli process with pX(1) = q. In this case, E[X] = q and, according to the theorem, the set of sample paths for which Sm/m → q has probability 1. If we consider another Bernoulli process with pX(1) = q′, q′ ≠ q, then the old set of sample paths with Sm/m → q has probability 0 and the new set with Sm/m → q′ has probability 1. There are uncountably many choices for q, so there are uncountably many sets, each of which has probability 1 for its own q. Each of these sets is an event in the same class of events. Thus they partition the sample space into an uncountable set of events, each with probability 1 for its own q, and in addition there are the sample points for which Sm/m does not have a limit. There is nothing wrong mathematically here, since these sets are described by countable unions and intersections. However, even the simple friendly Bernoulli process is very strange when one looks at events of this type.

Example 1.4.2. Suppose the Xi are IID binary with equiprobable ones and zeros. Then
X̄ = 1/2. We can easily construct sequences for which the sample average is not 1/2; for example the sequence of all zeros, the sequence of all ones, sequences with 1/3 zeros and 2/3 ones, and so forth. The theorem says, however, that collectively those sequences have zero probability.
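To make the example concrete, here is a small simulation sketch (not part of the text): it tracks the sample average Sm/m of a fair-coin Bernoulli process as m grows. The function name, seed, and parameters are illustrative assumptions.

```python
import random

def sample_average_path(n_flips, p=0.5, seed=1):
    """Return the sequence S_m / m for m = 1, ..., n_flips of IID Bernoulli(p) rv's."""
    rng = random.Random(seed)
    total = 0
    path = []
    for m in range(1, n_flips + 1):
        total += 1 if rng.random() < p else 0
        path.append(total / m)
    return path

path = sample_average_path(100_000)
# Early averages are noisy; by m = 100000 the average is very close to 1/2.
print(path[9], path[999], path[-1])
```

Running this with other values of p illustrates the footnote's point as well: under pX(1) = q the sample averages settle near q, so the probability-1 set of sample paths depends on q. The exceptional sequences (all zeros, all ones, and so forth) remain possible outcomes; the theorem only says that collectively they carry probability 0.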
The proof above actually shows more than the theorem claims. It shows that Version 1 and Version 2 of the SLLN are actually equivalent statements in the same sense that (1.61) and (1.62) are equivalent statements. Each form leads to its own set of insights, but when we show that other sequences of rv's satisfy one or the other of these forms, we can recognize almost immediately that the other equivalent form is also valid.

1.4.7 Convergence of random variables

This section has developed a number of laws of large numbers, each saying that a sum of
many IID random variables (rv’s), suitably normalized, is essentially equal to the mean.
In the case of the CLT, the limiting distribution around the mean is also speciﬁed to be
Gaussian. At the outermost intuitive level, i.e., at the level most useful when ﬁrst looking
at some very complicated set of issues, viewing the sample a...
This note was uploaded on 09/27/2010 for the course EE 229 taught by Professor R.srikant during the Spring '09 term at University of Illinois, Urbana Champaign.