marily
for integer valued rv’s, but if one transform can be evaluated, the other can be found
immediately. Finally, if we use −s, viewed as a complex variable, in place of r, we get the
two sided Laplace transform of the density of the random variable. Note that for all of
these transforms, multiplication in the transform domain corresponds to convolution of the
distribution functions or densities, and summation of independent rv’s. It is the simplicity
of taking products of transforms that makes transforms so useful in probability theory. We
will use transforms sparingly here, since, along with simplifying computational work, they
sometimes obscure underlying probabilistic insights.

CHAPTER 1. INTRODUCTION AND REVIEW OF PROBABILITY

1.4 The laws of large numbers

The laws of large numbers are a collection of results in probability theory that describe
the behavior of the arithmetic average of $n$ rv’s for large $n$. For any $n$ rv’s, $X_1, \ldots, X_n$,
the arithmetic average is the rv $(1/n)\sum_{i=1}^{n} X_i$. Since in any outcome of the experiment,
the sample value of this rv is the arithmetic average of the sample values of $X_1, \ldots, X_n$,
this random variable is called the sample average. If $X_1, \ldots, X_n$ are viewed as successive
variables in time, this sample average is called the time-average. Under fairly general
assumptions, the standard deviation of the sample average goes to 0 with increasing $n$, and,
in various ways, depending on the assumptions, the sample average approaches the mean.
These results are central to the study of stochastic processes because they allow us to relate
time-averages (i.e., the average over time of individual sample paths) to ensemble-averages
(i.e., the mean of the value of the process at a given time). In this and the next section, we
discuss two of these results, the weak and the strong law of large numbers for independent
identically distributed rv’s. The strong law requires considerable patience to understand,
but it is a basic and useful result in studying stochastic processes. We start with some basic
inequalities which are useful in their own right and which lead us directly to the weak law
of large numbers.

1.4.1 Basic inequalities

Inequalities play an unusually important role in probability theory, perhaps since so much
of the theory is based on mathematical analysis, or perhaps because many of the problems
are so complex that important quantities cannot be evaluated exactly, but can only be
bounded. One of the simplest and most basic of these inequalities is the Markov inequality,
which states that if a nonnegative random variable $Y$ has a mean $E[Y]$, then the probability
that $Y$ equals or exceeds any given number $y > 0$ satisfies
\[ \Pr\{Y \ge y\} \le \frac{E[Y]}{y} \qquad \text{(Markov inequality)}. \tag{1.42} \]
Figure 1.7 derives this result using the fact (see Figure 1.3) that the mean of a nonnegative
rv is the integral of its complementary distribution function, i.e., of the area under the curve
$\Pr\{Y > z\}$. Exercise 1.8 gives another simple proof using an indicator random variable.
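As a rough numerical illustration of (1.42), the sketch below compares the Markov bound with an exact tail probability. The exponential distribution is an assumption chosen only because its tail has a closed form; the function names are hypothetical and do not come from the text.

```python
import math

def markov_bound(mean, y):
    """Right side of (1.42): E[Y]/y for a nonnegative rv and y > 0.

    Values above 1 are valid but carry no information.
    """
    if mean < 0 or y <= 0:
        raise ValueError("need mean >= 0 and y > 0")
    return mean / y

def exponential_tail(mean, y):
    """Exact Pr{Y >= y} for an exponential rv with the given mean (illustrative choice)."""
    return math.exp(-y / mean)

mean = 1.0
for y in (0.5, 1.0, 2.0, 5.0, 10.0):
    exact = exponential_tail(mean, y)
    bound = markov_bound(mean, y)
    print(f"y={y:5.1f}  exact tail={exact:.5f}  Markov bound={bound:.5f}")
```

For this particular distribution the exact tail decays exponentially while the bound decays only as 1/y, so the gap widens quickly as y grows.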
[Figure 1.7: Demonstration that $y \Pr\{Y \ge y\} \le E[Y]$. Figure labels: area under curve = $E[Y]$; $\Pr\{Y \ge y\}$ marked at the point $y$; area = $y \Pr\{Y \ge y\}$.]
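The geometric argument of Figure 1.7 can also be written as a chain of inequalities. This is a sketch that takes as given the fact that the mean of a nonnegative rv is the integral of its complementary distribution function; the integration variable $z$ is the one used for the curve $\Pr\{Y > z\}$.

```latex
\begin{align*}
E[Y] &= \int_0^{\infty} \Pr\{Y > z\}\, dz \\
     &\ge \int_0^{y} \Pr\{Y > z\}\, dz
        && \text{(the integrand is nonnegative)} \\
     &\ge \int_0^{y} \Pr\{Y \ge y\}\, dz
        && \text{(for } z < y,\ \{Y \ge y\} \subseteq \{Y > z\}\text{)} \\
     &= y \Pr\{Y \ge y\}.
\end{align*}
```

Dividing both sides by $y > 0$ gives the Markov inequality.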
As an example of this inequality, assume that the average height of a population of people
is 1.6 meters. Then the Markov inequality states that at most half of the population have a
height exceeding 3.2 meters. We see from this example that the Markov inequality is often
very weak. However, for any $y > 0$, we can consider a rv that takes on the value $y$ with
probability $\epsilon$ and the value 0 with probability $1 - \epsilon$; this rv satisfies the Markov inequality
at the point $y$ with equality. Another application of Figure 1.7 is the observation that,
for any given nonnegative rv $Y$ with finite mean (i.e., with finite area under the curve $\Pr\{Y \ge y\}$),
\[ \lim_{y \to \infty} y \Pr\{Y \ge y\} = 0. \tag{1.43} \]
This is explained more fully by Exercise 1.31 and will be useful shortly in the proof of
Theorem 1.3.
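The two observations above can be checked with a short sketch. The first part uses exact rational arithmetic for the two-point rv that meets the Markov inequality with equality. The second part evaluates $y \Pr\{Y \ge y\}$ for an exponential rv with mean 1, which is an assumed example of a nonnegative rv with finite mean. All function names are hypothetical.

```python
from fractions import Fraction
import math

def two_point_tail_and_bound(y, eps):
    """rv equal to y with probability eps and 0 otherwise (y > 0, 0 <= eps <= 1).

    Returns (Pr{Y >= y}, E[Y]/y), which coincide for this rv.
    """
    mean = eps * y        # E[Y] = eps*y + (1 - eps)*0
    tail = eps            # Pr{Y >= y} = eps, since only the value y reaches y
    bound = mean / y      # Markov bound at the point y
    return tail, bound

def scaled_exponential_tail(y):
    """y * Pr{Y >= y} for an exponential rv with mean 1 (illustrative choice)."""
    return y * math.exp(-y)

tail, bound = two_point_tail_and_bound(Fraction(32, 10), Fraction(1, 2))
print("two-point rv: tail =", tail, " Markov bound =", bound)

for y in (1, 10, 50, 100):
    print(f"y={y:4d}  y*Pr{{Y>=y}} = {scaled_exponential_tail(y):.3e}")
```

The printed products shrink toward 0 as y grows, consistent with the limit stated for rv’s with finite mean. A single distribution is only an illustration, not a proof.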
We now use the Markov inequality to establish the probably better known Chebyshev
inequality. Let $Z$ be an arbitrary rv with finite mean $E[Z]$ and finite variance $\sigma_Z^2$, and
define $Y$ as the nonnegative rv $Y = (Z - E[Z])^2$. Thus $E[Y] = \sigma_Z^2$. Applying (1.42),
\[ \Pr\{(Z - E[Z])^2 \ge y\} \le \frac{\sigma_Z^2}{y}. \]
Replacing $y$ with $\epsilon^2$ (for any $\epsilon > 0$) and noting that the event $\{(Z - E[Z])^2 \ge \epsilon^2\}$ is the
same as $|Z - E[Z]| \ge \epsilon$, this becomes
\[ \Pr\{|Z - E[Z]| \ge \epsilon\} \le \frac{\sigma_Z^2}{\epsilon^2} \qquad \text{(Chebyshev inequality)}. \tag{1.44} \]
Note that the Markov inequality bounds just the upper tail of the distribution function and
applies only to nonnegative rv’s, whereas the Chebyshev inequality bounds both tails of
the distribution function. The more important difference, however, is that the Chebyshev
bound goes to zero inversely with the square of the distance from the mean, whereas the
Markov bound goes to zero inversely with the distance from 0 (and thus asymptoticall...
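The different decay rates of the two bounds can be seen in a small numerical sketch. It assumes an exponential rv $Z$ with mean 1 and variance 1, chosen only because the exact probabilities have closed forms; the function names are hypothetical. The Markov bound is applied to the upper-tail event $\{Z \ge E[Z] + \epsilon\}$, while the Chebyshev bound covers both tails.

```python
import math

def chebyshev_bound(variance, eps):
    """Right side of (1.44): variance / eps^2, for eps > 0."""
    return variance / eps ** 2

def markov_upper_tail_bound(mean, eps):
    """Markov bound (1.42) on Pr{Z >= mean + eps} for a nonnegative rv Z."""
    return mean / (mean + eps)

def exact_deviation_prob(eps):
    """Exact Pr{|Z - 1| >= eps} when Z is exponential with mean 1 and variance 1."""
    upper = math.exp(-(1.0 + eps))                               # Pr{Z >= 1 + eps}
    lower = 1.0 - math.exp(-(1.0 - eps)) if eps < 1.0 else 0.0   # Pr{Z <= 1 - eps}
    return upper + lower

for eps in (0.5, 1.0, 2.0, 5.0, 10.0):
    print(f"eps={eps:5.1f}  exact={exact_deviation_prob(eps):.5f}  "
          f"Chebyshev={chebyshev_bound(1.0, eps):.5f}  "
          f"Markov (upper tail)={markov_upper_tail_bound(1.0, eps):.5f}")
```

In this example the Chebyshev bound is looser than the Markov bound for small eps and tighter once eps is large, reflecting the 1/eps^2 decay versus decay inversely with distance from 0.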
Spring '09, R. Srikant