Discrete-time stochastic processes



…with distance from the mean). There is another variant of the Markov inequality, known as an exponential bound or Chernoff bound, in which the bound goes to 0 exponentially with distance from the mean. Let $Y = \exp(rZ)$ for some arbitrary rv $Z$ that has a moment generating function, $g_Z(r) = \mathsf{E}[\exp(rZ)]$, over some open interval of real values of $r$ including $r = 0$. Then, for any $r$ in that interval, (1.42) becomes

$$\Pr\{\exp(rZ) \ge y\} \le \frac{g_Z(r)}{y}.$$

Letting $y = \exp(ra)$ for some constant $a$, we have the following two inequalities:

$$\Pr\{Z \ge a\} \le g_Z(r)\exp(-ra); \qquad \text{(Exponential bound for } r \ge 0\text{)}. \tag{1.45}$$

$$\Pr\{Z \le a\} \le g_Z(r)\exp(-ra); \qquad \text{(Exponential bound for } r \le 0\text{)}. \tag{1.46}$$

Note that the right-hand side of (1.45) is one for $r = 0$ and its derivative with respect to $r$ is $\overline{Z} - a$ at $r = 0$. Thus, for any $a > \overline{Z}$, this bound is less than one for small enough $r > 0$, i.e.,

$$\Pr\{Z \ge a\} \le g_Z(r)\exp(-ra) < 1; \qquad \text{for } a > \overline{Z} \text{ and sufficiently small } r > 0. \tag{1.47}$$

Note that for fixed $r > 0$, this bound decreases exponentially with increasing $a$. Similarly, for $r < 0$ and $a < \overline{Z}$, (1.46) satisfies

$$\Pr\{Z \le a\} \le g_Z(r)\exp(-ra) < 1; \qquad \text{for } a < \overline{Z} \text{ and } r < 0 \text{ sufficiently close to } 0. \tag{1.48}$$

The bound in (1.46) is less than 1 for negative $r$ sufficiently close to 0. This bound also decreases exponentially with decreasing $a$. Both these bounds can be optimized over $r$ to get the strongest bound for any given $a$. These bounds will be used in the next section to prove the strong law of large numbers. They will also be used extensively in Chapter 7 and are useful for detection, random walks, and information theory.

1.4.2 Weak law of large numbers with a finite variance

Let $X_1, X_2, \ldots, X_n$ be IID rv's with a finite mean $\overline{X}$ and finite variance $\sigma_X^2$, let $S_n = X_1 + \cdots + X_n$, and consider the sample average $S_n/n$. We saw in (1.32) that $\sigma_{S_n}^2 = n\sigma_X^2$. Thus the variance of $S_n/n$ is

$$\mathrm{VAR}\left[\frac{S_n}{n}\right] = \mathsf{E}\left[\left(\frac{S_n - n\overline{X}}{n}\right)^2\right] = \frac{1}{n^2}\,\mathsf{E}\left[\left(S_n - n\overline{X}\right)^2\right] = \frac{\sigma_X^2}{n}. \tag{1.49}$$
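As an aside, the optimization over $r$ mentioned above can be made concrete for a standard normal $Z$, whose MGF $g_Z(r) = \exp(r^2/2)$ exists for all $r$: minimizing $g_Z(r)\exp(-ra)$ over $r \ge 0$ gives $r = a$ and the bound $\exp(-a^2/2)$. The following Python sketch (not part of the original text; the function names are my own) compares this optimized Chernoff bound with the exact normal tail probability:

```python
import math

def chernoff_bound(a, r):
    """Chernoff bound g_Z(r) * exp(-r*a) for a standard normal Z,
    whose MGF is g_Z(r) = exp(r**2 / 2)."""
    return math.exp(r * r / 2.0 - r * a)

def optimized_chernoff(a):
    """Minimizing over r: d/dr (r**2/2 - r*a) = r - a = 0, so the
    optimum is r = a, giving the bound exp(-a**2 / 2)."""
    return chernoff_bound(a, a)

def normal_tail(a):
    """Exact Pr{Z >= a} for a standard normal Z, via the
    complementary error function."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))

for a in (1.0, 2.0, 3.0):
    bound = optimized_chernoff(a)
    exact = normal_tail(a)
    print(f"a={a}: Chernoff bound={bound:.4e}, exact tail={exact:.4e}")
    assert exact <= bound  # the bound always dominates the true tail
```

As (1.47) promises, the bound is below 1 for every $a > \overline{Z} = 0$ and decays exponentially in $a$; it is loose by a polynomial factor, which is typical of Chernoff bounds.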
This says that the standard deviation of the sample average $S_n/n$ is $\sigma_X/\sqrt{n}$, which approaches 0 as $n$ increases. Figure 1.8 illustrates this decrease in the standard deviation of $S_n/n$ with increasing $n$. In contrast, recall that Figure 1.5 illustrated how the standard deviation of $S_n$ increases with $n$. From (1.49), we see that

$$\lim_{n \to \infty} \mathsf{E}\left[\left(\frac{S_n}{n} - \overline{X}\right)^2\right] = 0. \tag{1.50}$$

As a result, we say that $S_n/n$ converges in mean square to $\overline{X}$. This convergence in mean square essentially says that the sample average approaches the mean with increasing $n$. This connection of the sample average (which is a random variable) to the mean (which is a fixed number) is probably the most important conceptual result in probability theory. There are many ways of expressing this concept and many ways of relaxing the IID and finite-variance conditions. Perhaps the most insightful way of expressing the concept is the following weak law of large numbers (it is called a weak law to contrast it with the strong law of large numbers, which will be discussed later in this section).

Theorem 1.2 (Weak law with finite variance). Let $S_n = X_1 + \cdots + X_n$ be the sum of $n$ IID rv's with a finite variance. Then the following equivalent conditions hold: first,

$$\lim_{n \to \infty} \Pr\left\{\left|\frac{S_n}{n} - \overline{X}\right| \ge \epsilon\right\} = 0 \qquad \text{for every } \epsilon > 0. \tag{1.51}$$

Second, a real-valued function $f(\epsilon, \delta)$ exists such that for every $\epsilon > 0$ and every $\delta > 0$,

$$\Pr\left\{\left|\frac{S_n}{n} - \overline{X}\right| \ge \epsilon\right\} \le \delta \qquad \text{for all } n \ge f(\epsilon, \delta). \tag{1.52}$$

Figure 1.8: The same distribution as Figure 1.5, scaled differently to give the distribution function of the sample average $Y_n = S_n/n$ for $n = 4$, $20$, and $50$. Note that as $n$ increases, the distribution function of $Y_n$ slowly becomes closer to a unit step at the mean, 0.25, of the variables $X$ being summed.
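The mean-square convergence in (1.49) and (1.50) is easy to observe numerically. The Monte Carlo sketch below (my own illustration, not from the original text) uses IID Bernoulli variables with mean 0.25, matching the mean in Figure 1.8, and checks that the empirical value of $\mathsf{E}[(S_n/n - \overline{X})^2]$ tracks $\sigma_X^2/n$:

```python
import random

random.seed(1)

P = 0.25                 # mean of each Bernoulli X_i (matches Figure 1.8)
VAR_X = P * (1 - P)      # sigma_X^2 = p(1 - p) = 0.1875

def mean_square_error(n, trials=10000):
    """Monte Carlo estimate of E[(S_n/n - Xbar)^2]; by (1.49) this
    should be close to VAR_X / n."""
    total = 0.0
    for _ in range(trials):
        s_n = sum(1 if random.random() < P else 0 for _ in range(n))
        total += (s_n / n - P) ** 2
    return total / trials

for n in (4, 20, 50):
    est = mean_square_error(n)
    print(f"n={n:3d}: empirical MSE={est:.5f}, sigma^2/n={VAR_X / n:.5f}")
```

The empirical mean-square error shrinks roughly as $1/n$, which is exactly the $\sigma_X^2/n$ behavior in (1.49) and the source of the unit-step limit in Figure 1.8.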
Discussion: For any arbitrarily small $\epsilon > 0$, (1.51) says that $\lim_{n\to\infty} \Pr\{A_n\} = 0$, where $A_n$ is the event that the sample average $S_n/n$ differs from the true mean by more than $\epsilon$. This means (see Figure 1.9) that the distribution function of $S_n/n$ approaches a unit step function with the step at $\overline{X}$ as $n \to \infty$. Because of (1.51), $S_n/n$ is said to converge to $\mathsf{E}[X]$ in probability. Figure 1.9 also illustrates how the $\epsilon$ and $\delta$ of (1.52) control the approximation of $F_{S_n/n}$ for finite $n$ by a unit step.

Proof: For any given $\epsilon > 0$, we can apply the Chebyshev inequality, (1.44), to the sample average $S_n/n$, getting

$$\Pr\left\{\left|\frac{S_n}{n} - \overline{X}\right| \ge \epsilon\right\} \le \frac{\sigma^2}{n\epsilon^2}, \tag{1.53}$$

where $\sigma^2$ is the variance of each $X_i$. For the given $\epsilon > 0$, the right-hand side of (1.53) is decreasing toward…
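The Chebyshev bound (1.53) at the heart of the proof can be checked empirically. The sketch below (my own illustration, again using Bernoulli(0.25) variables for continuity with the earlier figures) estimates the deviation probability on the left of (1.53) and compares it with the bound $\sigma^2/(n\epsilon^2)$:

```python
import random

random.seed(2)

P = 0.25                 # mean of each Bernoulli X_i
VAR_X = P * (1 - P)      # sigma^2 = 0.1875

def deviation_prob(n, eps, trials=10000):
    """Monte Carlo estimate of Pr{|S_n/n - Xbar| >= eps} for
    IID Bernoulli(P) variables."""
    hits = 0
    for _ in range(trials):
        s_n = sum(1 if random.random() < P else 0 for _ in range(n))
        if abs(s_n / n - P) >= eps:
            hits += 1
    return hits / trials

eps = 0.1
for n in (25, 100, 400):
    emp = deviation_prob(n, eps)
    bound = VAR_X / (n * eps * eps)   # right-hand side of (1.53)
    print(f"n={n:3d}: empirical={emp:.4f}, Chebyshev bound={bound:.4f}")
```

The bound holds at every $n$ but is quite loose (for small $n$ it can even exceed 1 and is then vacuous); its only role in the proof is that it tends to 0 as $n \to \infty$, which is all (1.51) requires.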

## This note was uploaded on 09/27/2010 for the course EE 229 taught by Professor R.srikant during the Spring '09 term at University of Illinois, Urbana Champaign.
