Discrete-time stochastic processes

…the bound goes to 0 quadratically with distance from the mean). There is another variant of the Markov inequality, known as an exponential bound or Chernoff bound, in which the bound goes to 0 exponentially with distance from the mean. Let $Y = \exp(rZ)$ for some arbitrary rv $Z$ that has a moment generating function $g_Z(r) = \mathrm{E}[\exp(rZ)]$ over some open interval of real values of $r$ including $r = 0$. Then, for any $r$ in that interval, (1.42) becomes

$$\Pr\{\exp(rZ) \geq y\} \;\leq\; \frac{g_Z(r)}{y}.$$

Letting $y = \exp(ra)$ for some constant $a$, we have the following two inequalities:

$$\Pr\{Z \geq a\} \;\leq\; g_Z(r)\exp(-ra); \qquad \text{(exponential bound for } r \geq 0\text{)}. \tag{1.45}$$

$$\Pr\{Z \leq a\} \;\leq\; g_Z(r)\exp(-ra); \qquad \text{(exponential bound for } r \leq 0\text{)}. \tag{1.46}$$

Note that the right-hand side of (1.45) is one for $r = 0$ and its derivative with respect to $r$ is $\overline{Z} - a$ at $r = 0$. Thus, for any $a > \overline{Z}$, this bound is less than one for small enough $r > 0$, i.e.,

$$\Pr\{Z \geq a\} \;\leq\; g_Z(r)\exp(-ra) \;<\; 1 \qquad \text{for } a > \overline{Z} \text{ and sufficiently small } r > 0. \tag{1.47}$$

Note that for fixed $r > 0$, this bound decreases exponentially with increasing $a$. Similarly, for $r < 0$ and $a < \overline{Z}$, (1.46) satisfies

$$\Pr\{Z \leq a\} \;\leq\; g_Z(r)\exp(-ra) \;<\; 1 \qquad \text{for } a < \overline{Z} \text{ and } r < 0 \text{ sufficiently close to } 0. \tag{1.48}$$

The bound in (1.46) is less than 1 for negative $r$ sufficiently close to 0, and it also decreases exponentially with decreasing $a$. Both of these bounds can be optimized over $r$ to get the strongest bound for any given $a$. These bounds will be used in the next section to prove the strong law of large numbers. They will also be used extensively in Chapter 7 and are useful for detection, random walks, and information theory.

1.4.2 Weak law of large numbers with a finite variance

Let $X_1, X_2, \ldots, X_n$ be IID rv's with a finite mean $\overline{X}$ and finite variance $\sigma_X^2$, let $S_n = X_1 + \cdots + X_n$, and consider the sample average $S_n/n$. We saw in (1.32) that $\sigma_{S_n}^2 = n\sigma_X^2$. Thus the variance of $S_n/n$ is

$$\mathrm{VAR}\!\left[\frac{S_n}{n}\right] = \mathrm{E}\!\left[\left(\frac{S_n - n\overline{X}}{n}\right)^{\!2}\right] = \frac{1}{n^2}\,\mathrm{E}\!\left[\left(S_n - n\overline{X}\right)^2\right] = \frac{\sigma_X^2}{n}. \tag{1.49}$$

This says that the standard deviation of the sample average $S_n/n$ is $\sigma_X/\sqrt{n}$, which approaches 0 as $n$ increases. Figure 1.8 illustrates this decrease in the standard deviation of $S_n/n$ with increasing $n$. In contrast, recall that Figure 1.5 illustrated how the standard deviation of $S_n$ increases with $n$. From (1.49), we see that

$$\lim_{n\to\infty} \mathrm{E}\!\left[\left(\frac{S_n}{n} - \overline{X}\right)^{\!2}\right] = 0. \tag{1.50}$$

As a result, we say that $S_n/n$ converges in mean square to $\overline{X}$. This convergence in mean square essentially says that the sample average approaches the mean with increasing $n$. This connection of the sample average (which is a random variable) to the mean (which is a fixed number) is probably the most important conceptual result in probability theory. There are many ways of expressing this concept and many ways of relaxing the IID and finite-variance conditions. Perhaps the most insightful way of expressing the concept is the following weak law of large numbers (it is called a weak law to contrast it with the strong law of large numbers, which will be discussed later in this section).

Theorem 1.2 (Weak law with finite variance). Let $S_n = X_1 + \cdots + X_n$ be the sum of $n$ IID rv's with a finite variance. Then the following equivalent conditions hold: first,

$$\lim_{n\to\infty} \Pr\left\{\left|\frac{S_n}{n} - \overline{X}\right| \geq \epsilon\right\} = 0 \qquad \text{for every } \epsilon > 0. \tag{1.51}$$

Second, a real-valued function $f(\epsilon, \delta)$ exists such that for every $\epsilon > 0$ and every $\delta > 0$,

$$\Pr\left\{\left|\frac{S_n}{n} - \overline{X}\right| \geq \epsilon\right\} \leq \delta \qquad \text{for all } n \geq f(\epsilon, \delta). \tag{1.52}$$
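Theorem 1.2 is easy to check by simulation. The sketch below is a minimal illustration, not part of the text: it assumes, for concreteness, binary IID rv's with $\Pr\{X = 1\} = 1/4$ (matching the mean 0.25 of Figure 1.8), and estimates the probability on the left side of (1.51) for several values of $n$, printing $\sigma_X/\sqrt{n}$ from (1.49) alongside.

```python
# A minimal simulation sketch of the weak law (Theorem 1.2). It assumes
# IID Bernoulli(1/4) rv's, so Xbar = 0.25 and sigma^2 = 0.25 * 0.75;
# this is an illustration only, not a construction from the text.
import random

p, eps, trials = 0.25, 0.1, 20_000
var = p * (1 - p)

for n in (4, 20, 50, 200):
    # Estimate Pr{|Sn/n - Xbar| >= eps} by Monte Carlo.
    hits = sum(abs(sum(random.random() < p for _ in range(n)) / n - p) >= eps
               for _ in range(trials))
    print(f"n={n:4d}  Pr{{|Sn/n - Xbar| >= {eps}}} ~ {hits / trials:.4f}  "
          f"sigma/sqrt(n) = {(var / n) ** 0.5:.4f}")
```

Both printed columns shrink as $n$ grows, which is exactly the convergence that Figure 1.8 displays as the distribution function of the sample average tightening into a step.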
[Figure 1.8: The same distribution as Figure 1.5, scaled differently to give the distribution function $F_{Y_n}(z)$ of the sample average $Y_n = S_n/n$, shown for $n = 4$, $n = 20$, and $n = 50$. Note that as $n$ increases, the distribution function of $Y_n$ slowly becomes closer to a unit step at the mean, 0.25, of the variables $X$ being summed.]

Discussion: For any arbitrarily small $\epsilon > 0$, (1.51) says that $\lim_{n\to\infty} \Pr\{A_n\} = 0$, where $A_n$ is the event that the sample average $S_n/n$ differs from the true mean by more than $\epsilon$. This means (see Figure 1.9) that the distribution function of $S_n/n$ approaches a unit step function with the step at $\overline{X}$ as $n \to \infty$. Because of (1.51), $S_n/n$ is said to converge to $\mathrm{E}[X]$ in probability. Figure 1.9 also illustrates how the $\epsilon$ and $\delta$ of (1.52) control the approximation of $F_{S_n/n}$ for finite $n$ by a unit step.

Proof: For any given $\epsilon > 0$, we can apply the Chebyshev inequality, (1.44), to the sample average $S_n/n$, getting

$$\Pr\left\{\left|\frac{S_n}{n} - \overline{X}\right| \geq \epsilon\right\} \;\leq\; \frac{\sigma^2}{n\epsilon^2}, \tag{1.53}$$

where $\sigma^2$ is the variance of each $X_i$. For the given $\epsilon > 0$, the right-hand side of (1.53) is decreasing toward…
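The Chebyshev bound (1.53) can also be compared numerically with the exponential bound (1.45), again under the illustrative Bernoulli(1/4) assumption used above. In the sketch below, the coarse grid search over $r$ is a crude stand-in for the optimization over $r$ mentioned after (1.48); none of the numerical choices come from the text.

```python
# A minimal sketch comparing the Chebyshev bound (1.53) and the optimized
# exponential bound (1.45) with simulated tail probabilities, assuming
# IID Bernoulli(1/4) rv's (an illustration only).
import math
import random

p, eps, trials = 0.25, 0.1, 10_000
var = p * (1 - p)
a = p + eps                  # threshold for the upper tail of Sn/n

def chernoff_bound(n, a):
    """Minimize g_Z(r)exp(-r*n*a) over a grid of r > 0 for Z = Sn, where
    g_X(r) = 1 - p + p*e^r; done in log domain to avoid overflow."""
    best = 0.0               # log of the trivial bound 1
    for i in range(1, 500):
        r = i * 0.01
        log_bound = n * (math.log(1 - p + p * math.exp(r)) - r * a)
        best = min(best, log_bound)
    return math.exp(best)

for n in (50, 200, 800):
    # Empirical one-sided tail Pr{Sn/n >= a} by Monte Carlo.
    upper = sum(sum(random.random() < p for _ in range(n)) / n >= a
                for _ in range(trials)) / trials
    cheb = var / (n * eps * eps)     # right side of (1.53), a two-sided bound
    print(f"n={n:4d}  Pr{{Sn/n >= {a}}} ~ {upper:.5f}  "
          f"Chebyshev={cheb:.4f}  Chernoff={chernoff_bound(n, a):.2e}")
```

For fixed $\epsilon$, the Chebyshev bound decays only as $1/n$ while the optimized exponential bound decays exponentially in $n$. Note also that setting the right side of (1.53) to $\delta$ and solving for $n$ gives $n \geq \sigma^2/(\epsilon^2\delta)$, so $f(\epsilon, \delta) = \sigma^2/(\epsilon^2\delta)$ is one choice consistent with (1.52).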