Discrete-time stochastic processes

# In this and the next section we discuss two of these


Chapter 1. Introduction and review of probability

... primarily for integer-valued rv's, but if one transform can be evaluated, the other can be found immediately. Finally, if we use $-s$, viewed as a complex variable, in place of $r$, we get the two-sided Laplace transform of the density of the random variable. Note that for all of these transforms, multiplication in the transform domain corresponds to convolution of the distribution functions or densities, and to summation of independent rv's. It is the simplicity of taking products of transforms that makes transforms so useful in probability theory. We will use transforms sparingly here, since, along with simplifying computational work, they sometimes obscure underlying probabilistic insights.

## 1.4 The laws of large numbers

The laws of large numbers are a collection of results in probability theory that describe the behavior of the arithmetic average of $n$ rv's for large $n$. For any $n$ rv's, $X_1, \ldots, X_n$, the arithmetic average is the rv $(1/n)\sum_{i=1}^{n} X_i$. Since in any outcome of the experiment, the sample value of this rv is the arithmetic average of the sample values of $X_1, \ldots, X_n$, this random variable is called the sample average. If $X_1, \ldots, X_n$ are viewed as successive variables in time, this sample average is called the time-average.

Under fairly general assumptions, the standard deviation of the sample average goes to 0 with increasing $n$, and, in various ways, depending on the assumptions, the sample average approaches the mean. These results are central to the study of stochastic processes because they allow us to relate time-averages (i.e., the average over time of individual sample paths) to ensemble-averages (i.e., the mean of the value of the process at a given time).

In this and the next section, we discuss two of these results, the weak and the strong law of large numbers for independent identically distributed rv's.
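The claim that the standard deviation of the sample average shrinks with increasing $n$ can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes IID Uniform(0, 1) rv's, for which the standard deviation of the sample average is $\sqrt{1/12}/\sqrt{n}$, and the function names, trial count, and seed are illustrative choices.

```python
import random
import statistics


def sample_average(xs):
    """Arithmetic average (1/n) * sum of the sample values x_1, ..., x_n."""
    return sum(xs) / len(xs)


def std_of_sample_average(n, trials=2000, seed=0):
    """Empirical standard deviation of the sample average of n rv's.

    Assumption (not from the text): the rv's are IID Uniform(0, 1).
    The estimate is taken over `trials` independent repetitions.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    averages = [
        sample_average([rng.random() for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.pstdev(averages)


if __name__ == "__main__":
    sigma = (1 / 12) ** 0.5  # standard deviation of one Uniform(0, 1) rv
    for n in (1, 10, 100):
        # Compare the empirical value with the theoretical sigma / sqrt(n).
        print(n, std_of_sample_average(n), sigma / n ** 0.5)
```

Each printed row shows the empirical value next to $\sigma/\sqrt{n}$; under the stated assumptions the two should roughly agree and both should decrease as $n$ grows.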
The strong law requires considerable patience to understand, but it is a basic and useful result in studying stochastic processes. We start with some basic inequalities which are useful in their own right and which lead us directly to the weak law of large numbers.

### 1.4.1 Basic inequalities

Inequalities play an unusually important role in probability theory, perhaps since so much of the theory is based on mathematical analysis, or perhaps because many of the problems are so complex that important quantities cannot be evaluated exactly, but can only be bounded. One of the simplest and most basic of these inequalities is the Markov inequality, which states that if a non-negative random variable $Y$ has a mean $E[Y]$, then the probability that $Y$ exceeds any given number $y > 0$ satisfies

$$\Pr\{Y \ge y\} \le \frac{E[Y]}{y} \qquad \text{(Markov inequality).} \tag{1.42}$$

Figure 1.7 derives this result using the fact (see Figure 1.3) that the mean of a non-negative rv is the integral of its complementary distribution function, i.e., of the area under the curve $\Pr\{Y > z\}$. Exercise 1.8 gives another simple proof using an indicator random variable.

[Figure 1.7: Demonstration that $y \Pr\{Y \ge y\} \le E[Y]$. The area under the curve is $E[Y]$; the rectangle of width $y$ and height $\Pr\{Y \ge y\}$ under the curve has area $y \Pr\{Y \ge y\}$.]

As an example of this inequality, assume that the average height of a population of people is 1.6 meters. Then the Markov inequality states that at most half of the population have a height exceeding 3.2 meters. We see from this example that the Markov inequality is often very weak. However, for any $y > 0$, we can consider a rv that takes on the value $y$ with probability $\epsilon$ and the value 0 with probability $1 - \epsilon$; this rv satisfies the Markov inequality at the point $y$ with equality.

Another application of Figure 1.7 is the observation that, for any given non-negative rv $Y$ with finite mean (i.e., with finite area under the curve $\Pr\{Y \ge y\}$),

$$\lim_{y \to \infty} y \Pr\{Y \ge y\} = 0. \tag{1.43}$$

This is explained more fully by Exercise 1.31 and will be useful shortly in the proof of Theorem 1.3.

We now use the Markov inequality to establish the probably better known Chebyshev inequality. Let $Z$ be an arbitrary rv with finite mean $E[Z]$ and finite variance $\sigma_Z^2$, and define $Y$ as the non-negative rv $Y = (Z - E[Z])^2$. Thus $E[Y] = \sigma_Z^2$. Applying (1.42),

$$\Pr\left\{(Z - E[Z])^2 \ge y\right\} \le \frac{\sigma_Z^2}{y}.$$

Replacing $y$ with $\epsilon^2$ (for any $\epsilon > 0$) and noting that the event $\{(Z - E[Z])^2 \ge \epsilon^2\}$ is the same as $\{|Z - E[Z]| \ge \epsilon\}$, this becomes

$$\Pr\{|Z - E[Z]| \ge \epsilon\} \le \frac{\sigma_Z^2}{\epsilon^2} \qquad \text{(Chebyshev inequality).} \tag{1.44}$$

Note that the Markov inequality bounds just the upper tail of the distribution function and applies only to non-negative rv's, whereas the Chebyshev inequality bounds both tails of the distribution function. The more important difference, however, is that the Chebyshev bound goes to zero inversely with the square of the distance from the mean, whereas the Markov bound goes to zero inversely with the distance from 0 (and thus asymptoticall...
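The Markov and Chebyshev bounds can be compared with exact tail probabilities on a small discrete distribution. The following is a minimal sketch, not part of the original notes: the helper names, the dictionary representation of a probability mass function, and the fair-die and two-point examples are all illustrative assumptions. Exact rational arithmetic is used so that the equality case can be checked without rounding error.

```python
from fractions import Fraction


def mean(pmf):
    """E[Y] for a pmf given as {value: probability}."""
    return sum(p * v for v, p in pmf.items())


def variance(pmf):
    """Variance E[(Z - E[Z])^2] of the pmf."""
    m = mean(pmf)
    return sum(p * (v - m) ** 2 for v, p in pmf.items())


def upper_tail(pmf, y):
    """Exact Pr{Y >= y}."""
    return sum(p for v, p in pmf.items() if v >= y)


def two_sided_tail(pmf, eps):
    """Exact Pr{|Z - E[Z]| >= eps}."""
    m = mean(pmf)
    return sum(p for v, p in pmf.items() if abs(v - m) >= eps)


def markov_bound(pmf, y):
    """E[Y]/y; requires a non-negative rv and y > 0."""
    if y <= 0 or any(v < 0 for v in pmf):
        raise ValueError("Markov bound needs a non-negative rv and y > 0")
    return mean(pmf) / y


def chebyshev_bound(pmf, eps):
    """Variance / eps^2; requires eps > 0."""
    if eps <= 0:
        raise ValueError("Chebyshev bound needs eps > 0")
    return variance(pmf) / eps ** 2


if __name__ == "__main__":
    # Hypothetical example: a fair six-sided die.
    die = {k: Fraction(1, 6) for k in range(1, 7)}
    print("Markov:   ", upper_tail(die, 5), "<=", markov_bound(die, 5))
    print("Chebyshev:", two_sided_tail(die, 2), "<=", chebyshev_bound(die, 2))

    # Two-point rv: value y with probability eps, else 0 (equality case).
    y, eps = 8, Fraction(1, 4)
    two_point = {0: 1 - eps, y: eps}
    print("Equality: ", upper_tail(two_point, y), "==", markov_bound(two_point, y))
```

In the die example both bounds hold but are loose, while the two-point rv meets the Markov bound with equality at the point $y$, as described in the text.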

## This note was uploaded on 09/27/2010 for the course EE 229 taught by Professor R. Srikant during the Spring '09 term at University of Illinois, Urbana Champaign.
