with distance from the mean).
There is another variant of the Markov inequality, known as an exponential bound or
Chernoﬀ bound, in which the bound goes to 0 exponentially with distance from the mean.
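Before deriving the bound, here is a small numerical sketch (not from the text) of how much stronger an exponential bound can be than the Chebyshev bound. It assumes a standard Gaussian Z, whose moment generating function exp(r²/2) is known in closed form; minimizing exp(r²/2 − ra) over r > 0 gives exp(−a²/2), versus 1/a² from Chebyshev.

```python
import math

# For a standard Gaussian Z (mean 0, variance 1), the moment generating
# function is g_Z(r) = exp(r**2 / 2).  The exponential (Chernoff) bound on
# Pr{Z >= a} is min over r > 0 of g_Z(r) * exp(-r*a) = exp(-a**2 / 2)
# (the minimizing r is r = a), while Chebyshev gives only 1 / a**2.

def chebyshev_bound(a):
    # Pr{|Z| >= a} <= 1/a**2, so in particular Pr{Z >= a} <= 1/a**2
    return 1.0 / a**2

def chernoff_bound(a, r):
    # g_Z(r) * exp(-r*a) for a standard Gaussian
    return math.exp(r**2 / 2 - r * a)

for a in [2.0, 4.0, 8.0]:
    # optimize the exponential bound over r numerically on a grid
    best = min(chernoff_bound(a, r) for r in [i * 0.01 for i in range(1, 2001)])
    print(a, chebyshev_bound(a), best)
```

The grid search recovers the analytic minimizer r = a; the exponential bound decays like exp(−a²/2) while the Chebyshev bound decays only like 1/a².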
Let Y = exp(rZ ) for some arbitrary rv Z that has a moment generating function, gZ (r) =
E [exp(rZ )] over some open interval of real values of r including r = 0. Then, for any r in
that interval, (1.42) becomes
Pr {exp(rZ) ≥ y} ≤ gZ(r)/y.

Letting y = exp(ra) for some constant a, we have the following two inequalities,
Pr {Z ≥ a} ≤ gZ(r) exp(−ra) ;    (Exponential bound for r ≥ 0).   (1.45)

Pr {Z ≤ a} ≤ gZ(r) exp(−ra) ;    (Exponential bound for r ≤ 0).   (1.46)

Note that the right hand side of (1.45) is one for r = 0 and its derivative with respect to r is Z̄ − a at r = 0. Thus, for any a > Z̄, this bound is less than one for small enough r > 0, i.e.,
Pr {Z ≥ a} ≤ gZ(r) exp(−ra) < 1 ;    for a > Z̄ and sufficiently small r > 0.   (1.47)

Note that for fixed r > 0, this bound decreases exponentially with increasing a. Similarly,
for r < 0 and a < Z̄, (1.46) satisfies

Pr {Z ≤ a} ≤ gZ(r) exp(−ra) < 1 ;    for a < Z̄ and r < 0 sufficiently close to 0.   (1.48)

The bound in (1.46) is less than 1 for negative r sufficiently close to 0. This bound also
decreases exponentially with decreasing a. Both these bounds can be optimized over r to
get the strongest bound for any given a. These bounds will be used in the next section to
prove the strong law of large numbers. They will also be used extensively in Chapter 7 and
are useful for detection, random walks, and information theory.

1.4.2 Weak law of large numbers with a finite variance
Let X1, X2, . . . , Xn be IID rv's with a finite mean X̄ and finite variance σX², let Sn = X1 + · · · + Xn, and consider the sample average Sn/n. We saw in (1.32) that σSn² = nσX². Thus the variance of Sn/n is
"µ
µ∂
∂2 #
¢2 i σ 2
Sn
Sn − nX
1 h°
VAR
=E
= 2 E Sn − nX
=
.
(1.49)
n
n
n
n
√
This says that the standard deviation of the sample average Sn /n is σ / n, which approaches
0 as n increases. Figure 1.8 illustrates this decrease in the standard deviation of Sn /n with
increasing n. In contrast, recall that Figure 1.5 illustrated how the standard deviation of
Sn increases with n. From (1.49), we see that

lim(n→∞) E[(Sn/n − X̄)²] = 0.   (1.50)

As a result, we say that Sn/n converges in mean square to X̄.
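The σ²/n decay in (1.49), and hence the mean-square convergence in (1.50), can be checked by simulation. The following sketch (not from the text) uses IID uniform(0, 1) variables, for which σX² = 1/12, so the prediction is E[(Sn/n − X̄)²] = 1/(12n).

```python
import random

# Empirical check of (1.49)-(1.50): for IID uniform(0,1) rv's,
# sigma_X^2 = 1/12, so VAR[Sn/n] should be 1/(12 n), which goes to 0
# as n grows -- convergence in mean square.

random.seed(1)
XBAR = 0.5          # mean of a uniform(0,1) rv
TRIALS = 20000

def mean_square_error(n):
    # estimate E[(Sn/n - Xbar)^2] by simulation
    total = 0.0
    for _ in range(TRIALS):
        sn = sum(random.random() for _ in range(n))
        total += (sn / n - XBAR) ** 2
    return total / TRIALS

for n in [1, 10, 100]:
    print(n, mean_square_error(n), 1 / (12 * n))  # empirical vs sigma^2/n
```

Each empirical value should track the predicted σX²/n, shrinking by a factor of 10 as n grows by a factor of 10.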
This convergence in mean square essentially says that the sample average approaches the
mean with increasing n. This connection of the sample average (which is a random variable)
to the mean (which is a ﬁxed number) is probably the most important conceptual result
in probability theory. There are many ways of expressing this concept and many ways
of relaxing the IID and ﬁnite variance conditions. Perhaps the most insightful way of
expressing the concept is the following weak law of large numbers (it is called a weak law
to contrast it with the strong law of large numbers, which will be discussed later in this
section).
Theorem 1.2 (Weak law with ﬁnite variance). Let Sn = X1 + · · · + Xn be the sum of
n IID rv's with a finite variance. Then the following equivalent conditions hold: first,

lim(n→∞) Pr{|Sn/n − X̄| ≥ ε} = 0    for every ε > 0.   (1.51)
Second, a real-valued function f(ε, δ) exists such that for every ε > 0 and every δ > 0,

Pr{|Sn/n − X̄| ≥ ε} ≤ δ    for all n ≥ f(ε, δ).   (1.52)
[Figure 1.8 here: plots of the distribution function FYn(z) of Yn = Sn/n for n = 4, 20, and 50, on the interval 0 ≤ z ≤ 1.]

Figure 1.8: The same distribution as Figure 1.5, scaled differently to give the distribution function of the sample average Yn. Note that as n increases, the distribution function of Yn slowly becomes closer to a unit step at the mean, 0.25, of the variables X being summed.

Discussion: For any arbitrarily small ε > 0, (1.51) says that lim(n→∞) Pr{An} = 0, where
An is the event that the sample average Sn/n differs from the true mean by more than ε. This means (see Figure 1.9) that the distribution function of Sn/n approaches a unit step function with the step at X̄ as n → ∞. Because of (1.51), Sn/n is said to converge to E[X] in probability. Figure 1.9 also illustrates how the ε and δ of (1.52) control the approximation of FSn/n for finite n by a unit step.
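The shrinking deviation probability in (1.51) can be illustrated numerically for the setup of Figure 1.8. The sketch below assumes the summed variables are Bernoulli with mean 0.25 (an assumption consistent with the step at 0.25 in the figure, not stated in this excerpt) and estimates Pr{|Sn/n − 0.25| ≥ ε} for growing n.

```python
import random

# Estimate Pr{|Sn/n - 0.25| >= EPS} for IID Bernoulli(0.25) rv's, an
# assumed model matching the mean 0.25 of Figure 1.8.  By (1.51) this
# probability should shrink toward 0 as n increases.

random.seed(2)
P, EPS, TRIALS = 0.25, 0.1, 10000

def prob_deviation(n):
    # fraction of trials in which the sample average misses the mean by >= EPS
    count = 0
    for _ in range(TRIALS):
        sn = sum(1 for _ in range(n) if random.random() < P)
        if abs(sn / n - P) >= EPS:
            count += 1
    return count / TRIALS

for n in [4, 20, 50, 200]:
    print(n, prob_deviation(n))
```

For n = 4 the sample average misses the mean by 0.1 or more most of the time; by n = 200 such deviations are rare, mirroring how the curves in Figure 1.8 steepen toward a unit step.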
Proof: For any given ε > 0, we can apply the Chebyshev inequality, (1.44), to the sample average Sn/n, getting

Pr{|Sn/n − X̄| ≥ ε} ≤ σ²/(nε²).   (1.53)

where σ² is the variance of each Xi. For the given ε > 0, the right-hand side of (1.53) is decreasing towar...
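The bound (1.53) also yields an explicit choice for the function in (1.52): the right-hand side σ²/(nε²) is at most δ once n ≥ σ²/(ε²δ), so f(ε, δ) = σ²/(ε²δ) works. A simulation sketch (not from the text, using uniform(0, 1) variables with σ² = 1/12) checks that this n indeed keeps the deviation probability below δ:

```python
import math
import random

# Check f(eps, delta) = sigma^2 / (eps^2 * delta) implied by (1.53):
# for n >= f(eps, delta), Pr{|Sn/n - Xbar| >= eps} should be <= delta.
# Uses IID uniform(0,1) rv's, for which sigma^2 = 1/12 and Xbar = 0.5.

random.seed(3)
SIGMA2, XBAR, TRIALS = 1 / 12, 0.5, 5000

def deviation_prob(n, eps):
    # estimate Pr{|Sn/n - Xbar| >= eps} by simulation
    count = 0
    for _ in range(TRIALS):
        sn = sum(random.random() for _ in range(n))
        if abs(sn / n - XBAR) >= eps:
            count += 1
    return count / TRIALS

eps, delta = 0.05, 0.1
n = math.ceil(SIGMA2 / (eps**2 * delta))   # f(eps, delta) from Chebyshev
print(n, deviation_prob(n, eps))           # empirical prob should be <= delta
```

Since Chebyshev is loose, the empirical probability at n = f(ε, δ) typically falls far below δ; the bound guarantees δ but does not come close to it.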