Discrete-time stochastic processes



...a model and the reality being modeled. These more general laws often go under the term ergodic theory.

Footnote 18: Central limit theorems also hold in many of these more general situations, but they usually do not have quite the generality of the laws of large numbers.

## 1.4.5 Weak law with an infinite variance

We now establish the law of large numbers without assuming a finite variance.

Theorem 1.3 (Weak Law of Large Numbers). Let $S_n = X_1 + \cdots + X_n$ where $X_1, X_2, \ldots$ are IID rv's with a finite mean $E[X]$. Then for any $\epsilon > 0$,

$$\lim_{n\to\infty} \Pr\left\{ \left| \frac{S_n}{n} - E[X] \right| \ge \epsilon \right\} = 0. \tag{1.57}$$

Outline of Proof. We use a truncation argument; such arguments are used frequently in dealing with rv's that have infinite variance. Let $b$ be a real number (which we later take to be increasing with $n$), and for each variable $X_i$, define a new rv $\breve{X}_i$ (see Figure 1.11) by

$$\breve{X}_i = \begin{cases} X_i & \text{for } E[X] - b \le X_i \le E[X] + b \\ E[X] + b & \text{for } X_i > E[X] + b \\ E[X] - b & \text{for } X_i < E[X] - b. \end{cases} \tag{1.58}$$

[Figure 1.11: The truncated rv $\breve{X}$ for a given rv $X$ has a distribution function which is truncated at $\overline{X} \pm b$.]

The truncated variables $\breve{X}_i$ are IID with a finite second moment. Also, the first moment approaches the mean of the original variables $X_i$ with increasing $b$. Thus, as shown in Exercise 1.26, the sum $\breve{S}_n = \breve{X}_1 + \cdots + \breve{X}_n$ of the truncated rv's satisfies the following type of weak law of large numbers:

$$\Pr\left\{ \left| \frac{\breve{S}_n}{n} - E[X] \right| \ge \epsilon \right\} \le \frac{8b\,E[|X|]}{n\epsilon^2} \tag{1.59}$$

for sufficiently large $b$.

The original sum $S_n$ is the same as $\breve{S}_n$ unless one of the $X_i$ has an outage, i.e., $|X_i - \overline{X}| > b$, so, using the union bound, $\Pr\{S_n \ne \breve{S}_n\} \le n \Pr\{|X_i - \overline{X}| > b\}$. Combining with (1.59),

$$\Pr\left\{ \left| \frac{S_n}{n} - E[X] \right| \ge \epsilon \right\} \le \frac{8b\,E[|X|]}{n\epsilon^2} + \frac{n}{b}\Big[ b \Pr\{|X - E[X]| > b\} \Big]. \tag{1.60}$$

The trick of the proof is to show that by letting $b$ and $n$ both approach infinity in the appropriate ratio, both of these terms go to 0, establishing the theorem. The way to do this is to let $g(b) = b \Pr\{|X - \overline{X}| > b\}$. From (1.43), $\lim_{b\to\infty} g(b) = 0$. Choosing $n$ by $b/n = \sqrt{g(b)}$, the first term in (1.60) becomes $8E[|X|]\sqrt{g(b)}/\epsilon^2$ and the second becomes $g(b)/\sqrt{g(b)} = \sqrt{g(b)}$, so both terms approach 0 with increasing $b$ and $n$, completing the proof.

## 1.4.6 Strong law of large numbers (SLLN)

We next discuss the strong law of large numbers. We do not have the mathematical tools to prove the theorem in its full generality, but will give a fairly insightful proof under the additional assumption that the rv under discussion has a moment generating function.

Theorem 1.4 (SLLN (Version 1)). Let $S_n = X_1 + \cdots + X_n$ where $X_1, X_2, \ldots$ are IID rv's with a finite mean $\overline{X}$. Then for every $\epsilon > 0$,

$$\lim_{n\to\infty} \Pr\left\{ \bigcup_{m \ge n} \left\{ \left| \frac{S_m}{m} - \overline{X} \right| > \epsilon \right\} \right\} = 0. \tag{1.61}$$

As discussed later, an equivalent statement is

$$\Pr\left\{ \bigcap_{n \ge 1} \bigcup_{m \ge n} \left\{ \left| \frac{S_m}{m} - \overline{X} \right| > \epsilon \right\} \right\} = 0. \tag{1.62}$$

For those who do not eat countable unions and intersections for breakfast, we describe what these equations mean before proving the theorem. This understanding is perhaps as difficult as following the proof. For brevity, assuming some fixed $\epsilon$, let

$$A_m = \left\{ \left| \frac{S_m}{m} - \overline{X} \right| > \epsilon \right\}. \tag{1.63}$$

This is the event that the sample average for $m$ trials differs from the mean by more than the given $\epsilon$. The weak law asserts that $\lim_{n\to\infty} \Pr\{A_n\} = 0$. The strong law (in the form (1.61)) asserts that $\lim_{n\to\infty} \Pr\{\bigcup_{m \ge n} A_m\} = 0$. This means that not only is $A_n$ unlikely, but that all subsequent $A_m$ are collectively unlikely.

Example 1.4.1. The difference between $\lim_{n\to\infty} \Pr\{\bigcup_{m \ge n} A_m\}$ and $\lim_{n\to\infty} \Pr\{A_n\}$ can be more easily understood by looking at a simpler sequence of events than those in (1.63). Let $U$ be a uniformly distributed random variable over the interval $[0, 1)$. Let $A_n$, $n \ge 1$, be a sequence of events locating the sample value of $U$ with greater and greater precision. In particular, let $A_1 = \{0 \le U < 1\}$ be the entire interval, and let $A_2 = \{0 \le U < 1/2\}$ and $A_3 = \{1/2 \le U < 1\}$ be the two half intervals. Then let $A_4 = \{0 \le U < 1/4\}, \ldots, A_7 = \{3/4 \le U < 1\}$ be the quarter intervals, etc. For each positive integer $k$, then, $\Pr\{A_n\} = 2^{-k}$ for $n$ in the range $2^k \le n < 2^{k+1}$.

We see that $\lim_{n\to\infty} \Pr\{A_n\} = 0$. On the other hand, for each $k$, any given sample value $u$ must lie in $A_n$ for some $n$ between $2^k$ and $2^{k+1}$. In other words, $\bigcup_{n=2^k}^{2^{k+1}-1} A_n = \Omega$. It follows that $\Pr\{\bigcup_{m \ge n} A_m\} = 1$ for all $n$. Taking the limit as $n \to \infty$, $\lim_{n\to\infty} \Pr\{\bigcup_{m \ge n} A_m\} = 1$. The example shows that, for this sequence of events, there is a considerable difference between the probability of the $n$th event and the collective probability of one or more events from $n$ on.

We next want to understand why (1.61) and (1.62) say the same thing. Let $B_n = \bigcup_{m \ge n} A_m$. The sequence $\{B_n; n \ge 1\}$ is a sequence of unions, and each event $B_n$ is modified from the previous event $B_{n-1}$ by the removal of $A_{n-1}$. Thus these unions are ne...
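The dyadic-interval events of Example 1.4.1 can be checked with exact arithmetic. The following is a minimal Python sketch, not part of the original notes; the function names (`level`, `interval`, `in_event`, `prob`, `first_hit_from`) are hypothetical and chosen only for illustration. It shows that the probability of each individual event shrinks with its index, while every sample value $u$ still lands in some later event, which is why the union from any index on has probability 1.

```python
from fractions import Fraction


def level(n):
    """Return the integer k with 2**k <= n < 2**(k+1), for n >= 1."""
    return n.bit_length() - 1


def interval(n):
    """Endpoints (lo, hi) of the half-open interval [lo, hi) defining A_n."""
    k = level(n)
    j = n - 2**k                  # position of A_n within level k
    width = Fraction(1, 2**k)     # each level-k interval has length 2**-k
    return j * width, (j + 1) * width


def in_event(u, n):
    """True if the sample value u lies in A_n."""
    lo, hi = interval(n)
    return lo <= u < hi


def prob(n):
    """Pr{A_n} for U uniform on [0, 1): the length of the interval."""
    lo, hi = interval(n)
    return hi - lo


def first_hit_from(u, n):
    """Smallest m >= n with u in A_m.

    The loop terminates for any u in [0, 1) because each complete level
    of intervals covers [0, 1).
    """
    m = n
    while not in_event(u, m):
        m += 1
    return m


if __name__ == "__main__":
    u = Fraction(1, 3)  # an arbitrary sample value, chosen for illustration
    for n in (1, 4, 6, 100, 1000):
        m = first_hit_from(u, n)
        print(f"n={n}: Pr(A_n)={prob(n)}, first m >= n with u in A_m is {m}")
```

Using `Fraction` keeps the interval endpoints exact, so the membership test does not depend on floating-point rounding.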

## This note was uploaded on 09/27/2010 for the course EE 229 taught by Professor R.srikant during the Spring '09 term at University of Illinois, Urbana Champaign.
