ECE 6010
Lecture 5 – Sequences and Limit Theorems

Convergent sequences of real numbers and functions
Definition 1. Let x_1, x_2, . . . be a sequence of real numbers. This sequence converges to a point x ∈ R if for every ε > 0 there is an N ∈ Z such that |x_n − x| < ε for all n ≥ N. We write x_n → x, or lim_{n→∞} x_n = x.
For real numbers (which are complete), a necessary and sufficient condition for convergence is

    lim_{n→∞} sup_{m>n} |x_m − x_n| = 0.

The latter condition says that x_n is a Cauchy sequence.

Definition 2. Suppose f_1, f_2, . . . is a sequence of functions D → R (on some common domain D). This sequence converges pointwise to f : D → R if f_n(x) → f(x) for every x ∈ D. That is, for every x ∈ D and ε > 0, there is an N ∈ Z such that |f_n(x) − f(x)| < ε for all n ≥ N. (It may be necessary to choose a different N for each x.)

Definition 3. We say that f_n converges uniformly to f if for each ε > 0 there is an N ∈ Z such that |f_n(x) − f(x)| < ε for all n ≥ N and for all x ∈ D.

Modes of convergence of sequences of r.v.s
Suppose X_1, X_2, . . . is a sequence of random variables defined on (Ω, F, P). How can we define a limit of this sequence? As it turns out, there are several different (and inequivalent) ways of defining convergence.

Almost sure convergence
This is a very strong form of convergence, and usually quite difﬁcult to prove.
Definition 4. A sequence of r.v.s {X_n}_{n≥1} converges almost surely (a.s.) to the r.v. X if P(Ω_0) = 1, where

    Ω_0 = {ω ∈ Ω : X_n(ω) → X(ω)}.

This is also called convergence with probability 1. So if P(Ω_0) = 1 then X_n → X a.s.; but if P(Ω_0^c) > 0 then X_n doesn't converge in the almost sure sense.

One tool for showing a.s. convergence is the following fact: X_n → X a.s. if and only if

    P( lim sup_{n→∞} ∪_{m>n} { |X_m − X| > ε } ) = 0   for every ε > 0.

Example 1. Let Ω = [0, 1], F = B[0, 1], and P uniform. Let X_n(ω) = n e^{−nω} for ω ∈ [0, 1] and n ∈ Z. Then X_n(ω) → 0 for every ω ∈ (0, 1], while X_n(0) = n (diverges). Since P({0}) = 0, we get X_n → 0 a.s.
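A small numerical sketch of Example 1 (assuming the sequence X_n(ω) = n e^{−nω}; the function names here are illustrative, not part of the notes):

```python
import math

def X(n, omega):
    # Example 1: X_n(omega) = n * exp(-n * omega) on [0, 1]
    return n * math.exp(-n * omega)

# For any fixed omega > 0 the exponential decay wins ...
tail_value = X(1000, 0.1)   # 1000 * e^(-100), astronomically small
# ... but at omega = 0 the sequence is X_n(0) = n, which diverges.
values_at_zero = [X(n, 0.0) for n in (1, 10, 100)]
```

The single bad point ω = 0 has probability zero under the uniform measure, which is exactly why the convergence is "almost sure" rather than everywhere.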
Mean-square convergence
This is a strong mode of convergence which is usually easier to show than a.s. convergence. It is widely used in engineering.
Definition 5. The sequence {X_n}_{n≥1} converges to the r.v. X in the mean-square sense if

    lim_{n→∞} E[(X_n − X)^2] = 0.

We write X_n → X (m.s.) or X_n → X (q.m.) ("quadratic mean").

There is a Cauchy criterion for m.s. convergence: if E[X_n^2] < ∞ for all n ∈ Z, then X_n converges in mean-square if and only if

    lim_{n→∞} sup_{m>n} E[(X_m − X_n)^2] = 0.

Example 2. Let Ω = [0, 1], F = B[0, 1], and P uniform: P([a, b]) = b − a. Let

    X_n(ω) = n for ω ∈ [0, 1/n^3],  0 otherwise.

Then

    E[X_n^2] = n^2 P([0, 1/n^3]) + 0^2 P([1/n^3, 1]) = n^2/n^3 = 1/n → 0.

So X_n → 0 m.s. What about a.s. convergence in this case?

Here is an interesting fact: if X_n → X (m.s.) and X_n → Y (a.s.), then X = Y (a.s.).
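The second-moment computation in Example 2 can be checked exactly with rational arithmetic (a sketch; `second_moment` is an illustrative name):

```python
from fractions import Fraction

def second_moment(n):
    # Example 2: X_n = n on [0, 1/n^3] and 0 elsewhere, P uniform on [0, 1]:
    # E[X_n^2] = n^2 * P([0, 1/n^3]) = n^2 / n^3 = 1/n
    return Fraction(n**2, n**3)

moments = [second_moment(n) for n in (1, 10, 100, 1000)]
```

Since E[X_n^2] = 1/n → 0, the sequence converges to 0 in mean square, matching the claim above.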
Convergence in Probability
Definition 6. The sequence {X_n}_{n≥1} converges to X in probability (i.p.) if

    P(|X_n − X| > ε) → 0  as n → ∞

for every ε > 0. Equivalently, we say that P(|X_n − X| ≤ ε) → 1.

Example 3. Let Ω = [0, 1], F = B[0, 1], and P uniform. Let

    X_n(ω) = n for ω ∈ [0, 1/n],  0 otherwise.

Then for 0 < ε < n,

    P(|X_n − 0| > ε) = P([0, 1/n]) = 1/n → 0,

so X_n → 0 (i.p.). Note: X_n → 0 (a.s.) as well, but X_n does not converge in m.s.
Convergence in Distribution

Definition 7. The sequence {X_n}_{n≥1} converges in distribution (or in law) to the random variable X if F_{X_n}(x) → F_X(x) at all continuity points of F_X.
Example 4. Let Ω = [0, 1], F = B[0, 1], P uniform. Let

    X_n(ω) = 1 for ω ≥ 1/n,  0 for ω < 1/n.

Then

    F_{X_n}(x) = (1/n) u(x) + (1 − 1/n) u(x − 1).

Then X_n → X, where F_X(x) = u(x − 1). (Draw the distributions.)

Note from this example that the X_n values don't really "approach" any value: the values are still 1 and 0. This is in distinction to the first three modes of convergence, in which |X_n − X| → 0 in some sense. By the definition of this mode of convergence, we don't have to worry about the points of discontinuity of F_{X_n}.

Example 5. Let X_n = 1/n for all ω ∈ Ω. (So it doesn't matter what the underlying P is.) The pointwise convergence is

    lim_{n→∞} F_{X_n}(x) = 1 for x > 0,  0 for x ≤ 0.   (not a c.d.f.)

(Why isn't this a c.d.f.? It is not right-continuous at x = 0.) Take

    F_X(x) = 1 for x ≥ 0,  0 for x < 0.

This is a c.d.f., but different from lim F_{X_n}. However, the difference is at a point of discontinuity of F_X. Hence X_n → 0 (in distribution).
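Example 5 can be checked directly (a sketch; the step-function helpers are illustrative): the c.d.f.s agree in the limit everywhere except at the single discontinuity point x = 0, which the definition of convergence in distribution deliberately excludes.

```python
def F_n(n, x):
    # c.d.f. of the constant r.v. X_n = 1/n: a unit step u(x - 1/n)
    return 1.0 if x >= 1.0 / n else 0.0

def F_X(x):
    # c.d.f. of the limit X = 0: a unit step at 0
    return 1.0 if x >= 0.0 else 0.0

# At the continuity point x = 0.5, F_n(x) -> F_X(x) = 1 ...
vals_half = [F_n(n, 0.5) for n in (1, 2, 10, 100)]
# ... while at the discontinuity x = 0, F_n(0) = 0 for every n but F_X(0) = 1.
vals_zero = [F_n(n, 0.0) for n in (1, 2, 10, 100)]
```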
Why and Which?

We have defined several different modes of convergence. Why so many? The basic answer is that they are inequivalent: no single one meets all the analytical needs. Some are stronger than others:

    X_n → X (a.s.)  implies  X_n → X (i.p.)
    X_n → X (m.s.)  implies  X_n → X (i.p.)
    X_n → X (i.p.)  implies  X_n → X (in distribution)

So convergence in distribution is weaker than i.p., m.s., or a.s. In general, none of the implications can be reversed. And m.s. and a.s. do not imply each other. (Venn diagram: dist. on the outside, then i.p., with m.s. and a.s. overlapping inside.)
Proof of X_n → X (m.s.) implies X_n → X (i.p.)

By Chebyshev's inequality, for every ε > 0,

    P(|X_n − X| > ε) ≤ E[(X_n − X)^2] / ε^2.

So if E[(X_n − X)^2] → 0, then P(|X_n − X| > ε) → 0 for all ε.
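The Chebyshev bound in this proof can be verified exactly on a concrete difference sequence (a sketch, reusing the Example 2 variable as D_n = X_n − X; names are illustrative):

```python
from fractions import Fraction

def prob_exceeds(n):
    # D_n = n on [0, 1/n^3], 0 elsewhere: P(|D_n| > eps) = 1/n^3 for 0 < eps < n
    return Fraction(1, n**3)

def chebyshev_bound(n, eps):
    # E[D_n^2] / eps^2 = (1/n) / eps^2
    return Fraction(1, n) / (eps * eps)

eps = Fraction(1, 2)
checks = [(prob_exceeds(n), chebyshev_bound(n, eps)) for n in (2, 10, 100)]
```

The tail probability (1/n^3) sits well below the bound (4/n here), and both sides go to 0, illustrating how m.s. convergence forces convergence in probability.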
Proof of X_n → X (a.s.) implies X_n → X (i.p.)

Choose ε > 0. Write

    B_n = {ω : sup_{m≥n} |X_m(ω) − X(ω)| > ε}.

This is a decreasing family of sets: suppose n_1 < n_2 and ω ∈ B_{n_2}. Then by the definition, ω ∈ B_{n_1}, so B_{n_2} ⊂ B_{n_1}. Note that ∩_{n≥1} B_n = lim_{n→∞} B_n.

Now consider the set Z = {ω : X_n(ω) does not converge to X(ω)}. For a given ε we see that ∩_n B_n is a subset of Z. Since X_n → X a.s., we have P(Z) = 0. Thus

    P(lim_n B_n) = lim_n P(B_n) = 0.

Now notice that {ω : |X_n − X| > ε} ⊂ B_n (since we are looking at only one point, and not sup_{m≥n}). So P(|X_n − X| > ε) ≤ P(B_n), which we just showed → 0.
Proof of X_n → X (i.p.) implies X_n → X (in distribution)

Suppose X_n → X (i.p.). Choose ε > 0 and let x be a continuity point of F_X. Then

    F_{X_n}(x) = P(X_n ≤ x)
               = P(X_n ≤ x, X ≤ x + ε) + P(X_n ≤ x, X > x + ε)
               ≤ F_X(x + ε) + P(|X_n − X| > ε),

since the event {X_n ≤ x, X > x + ε} implies {|X_n − X| > ε}. Similarly,

    F_X(x − ε) = P(X ≤ x − ε, X_n ≤ x) + P(X ≤ x − ε, X_n > x)
               ≤ F_{X_n}(x) + P(|X_n − X| > ε).

Since we have convergence in probability, lim_{n→∞} P(|X_n − X| > ε) = 0. Combining these,

    F_X(x − ε) ≤ lim inf_n F_{X_n}(x) ≤ lim sup_n F_{X_n}(x) ≤ F_X(x + ε).

Since x is a continuity point of F_X and ε is chosen arbitrarily, we can write

    lim_{n→∞} F_{X_n}(x) = F_X(x).   (Convergence in distribution.)
Some examples of invalid implications
To see which modes are “stronger” than others, we can consider some counterexamples.
Example 6. Can we say that X_n → X (i.p.) implies X_n → X (m.s.)? Let (Ω, F, P) = ([0, 1], B[0, 1], uniform). Let

    X_n(ω) = n for ω ∈ [0, 1/n],  0 otherwise.

We've shown that X_n → 0 (i.p.), but E[X_n^2] = n^2 P([0, 1/n]) = n → ∞, so X_n does not converge to 0 in m.s. Since X_n → 0 (a.s.), we also see that a.s. convergence does not imply m.s. convergence.
Example 7. Does i.p. imply a.s.? Define a sequence of r.v.s as follows on Ω = [0, 1]: X_1(ω) = 1. For X_2, X_3, divide Ω into two parts [0, 1/2), [1/2, 1], with X_2(ω) = 1 on the first half, and X_3(ω) = 1 on the second half. For X_4, X_5, X_6, X_7 split Ω into fourths, with X_4(ω) = 1 on the first fourth, etc. (Each X_n is 0 off its interval.)

Then

    P(|X_n − 0| > ε) = P(X_n = 1) = 2^{−⌊log_2 n⌋},

which decreases (at a rate roughly 1/n) as n → ∞. So X_n → 0 (i.p.).

However, for a.s. convergence, we see that for every ω the sequence X_n(ω) alternates (non-uniformly) between 0 and 1 forever. So X_n does not converge to 0 (a.s.).

Note that this example also converges in m.s. because the 2nd moment is E[X_n^2] = P(X_n = 1) → 0.
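The "sweeping indicator" construction of Example 7 can be simulated directly (a sketch; the indexing n = 2^k + j is one standard way to write it and the helper names are illustrative):

```python
def interval(n):
    # write n = 2^k + j with 0 <= j < 2^k; X_n is 1 on [j/2^k, (j+1)/2^k), else 0
    k = n.bit_length() - 1
    j = n - 2**k
    return (j / 2**k, (j + 1) / 2**k)

def X(n, omega):
    a, b = interval(n)
    return 1 if a <= omega < b else 0

omega = 0.3
# In each "stage" k the indicators sweep across [0, 1], so X_n(omega) = 1
# exactly once per stage: the sequence never settles at 0 pointwise ...
hits_per_stage = [sum(X(n, omega) for n in range(2**k, 2**(k + 1))) for k in range(8)]
# ... while P(X_n = 1) = 2^-k shrinks to 0, giving convergence i.p. and m.s.
prob_one = [1.0 / 2**k for k in range(8)]
```

Every ω gets hit once per stage forever, which is exactly the failure of a.s. convergence despite convergence in probability.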
Example 8. What about convergence in distribution and convergence i.p.? Let X ~ N(0, 1), and X_n = (−1)^n X. Note that X_n ~ N(0, 1), so F_{X_n} = F_X for all n. But

    P(|X_n − X| > ε) = P(|X| > ε/2) for n odd,  0 for n even.

So it does not → 0 for all ε; it alternates.

All the other modes of convergence depend on joint distributions, but convergence in distribution depends only on marginals, which don't tell us the whole picture.
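A quick simulation of Example 8 (a sketch; sample size and names are illustrative): every X_n has the same N(0, 1) marginal, yet the gap |X_n − X| alternates between 0 and 2|X|.

```python
import random

rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(10000)]  # draws of X ~ N(0, 1)

def X_n(n, x):
    # X_n = (-1)^n X has the same N(0, 1) marginal for every n
    return x if n % 2 == 0 else -x

eps = 0.5
# |X_n - X| = 2|X| for odd n (often > eps), and exactly 0 for even n
frac_odd = sum(1 for x in xs if abs(X_n(3, x) - x) > eps) / len(xs)
frac_even = sum(1 for x in xs if abs(X_n(4, x) - x) > eps) / len(xs)
```

The exceedance frequency never settles at 0, so there is no convergence in probability even though convergence in distribution is trivial here.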
Some other relationships:

1. If X_n → X (i.p.), then there is a subsequence {X_{n_k}}_{k≥1} such that lim_{k→∞} X_{n_k} = X (a.s.).
2. If X_n → X (i.p.) and there is a r.v. Y with finite second moment such that |X_n| ≤ Y for every n ∈ Z, then X_n → X (m.s.).
3. If X_n → C (in distribution), where C is a constant, then X_n → C (i.p.).

Limit Theorems
Laws of Large Numbers

Suppose X_1, X_2, . . . is a sequence of r.v.s. We are often interested in sums Σ_{i=1}^n X_i as n becomes large. What can we say about such sums?

Suppose all X_i have the same mean µ, E[X_i] = µ, and are uncorrelated. We would expect the average (1/n) Σ_{i=1}^n X_i to "approach" µ in some way as n → ∞. If var(X_i) < ∞, let us look at m.s. convergence; consider

    E[ ( (1/n) Σ_{i=1}^n X_i − µ )^2 ] = (1/n^2) E[ ( Σ_{i=1}^n (X_i − µ) )^2 ]
                                       = (1/n^2) E[ Σ_i Σ_j (X_i − µ)(X_j − µ) ]
                                       = (1/n^2) Σ_i Σ_j cov(X_i, X_j)
                                       = (1/n^2) Σ_{i=1}^n var(X_i)   (uncorrelated).

Summarizing: if E[X_i] = µ and the X_i are mutually uncorrelated and have finite variance, and (1/n^2) Σ_{i=1}^n var(X_i) → 0, then

    (1/n) Σ_{i=1}^n X_i → µ (m.s.),  so that  (1/n) Σ_{i=1}^n X_i → µ (i.p.).

This is an example of a weak law of large numbers.
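The variance calculation above predicts E[(mean_n − µ)^2] = (1/n^2) · n · var(X_i). A Monte Carlo sketch with i.i.d. Uniform[0, 1] draws (µ = 1/2, var = 1/12; distribution choice and names are illustrative):

```python
import random

rng = random.Random(2)
n, trials = 100, 2000
# i.i.d. Uniform[0, 1] draws are in particular uncorrelated with equal means
sq_errors = []
for _ in range(trials):
    mean_n = sum(rng.random() for _ in range(n)) / n
    sq_errors.append((mean_n - 0.5) ** 2)
mse = sum(sq_errors) / trials
# the derivation predicts E[(mean_n - mu)^2] = (1/n^2) * n * (1/12) = 1/(12 n)
predicted = 1.0 / (12 * n)
```

The empirical mean-squared error lands near 1/(12n), consistent with the m.s. (and hence i.p.) convergence of the sample mean to µ.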
Definition 8. Suppose {X_i}_{i≥1} is a sequence of r.v.s and {b_i}_{i≥1} is a sequence of reals diverging to ∞. Then {X_i}_{i≥1} satisfies a weak law of large numbers (WLLN) if there is another sequence {a_i}_{i≥1} of real numbers such that

    (1/b_n) Σ_{i=1}^n X_i − a_n → 0 (i.p.).

In the example we just gave, b_n = n and a_n = µ.

Definition 9. A strong law of large numbers is the same as the preceding definition, except that the convergence is almost sure (a.s.).
Kolmogorov's Strong Law
Definition 10. An infinite sequence of r.v.s is independent if every finite subcollection of the r.v.s is independent.

Theorem 1 (Kolmogorov's Strong Law). Suppose {X_n}_{n≥1} is a sequence of independent r.v.s with finite means µ_i = E[X_i] for each i. If

    Σ_{i=1}^∞ var(X_i) / b_i^2 < ∞,

where b_n ↑ ∞, then

    (1/b_n) Σ_{i=1}^n (X_i − µ_i) → 0 (a.s.).

Example 9. If b_n = n and µ_n = µ, then Kolmogorov's law implies: if

    Σ_{i=1}^∞ var(X_i) / i^2 < ∞,

then

    (1/n) Σ_{i=1}^n X_i → µ (a.s.).

Note that in the case that all the variances are bounded, e.g. var(X_i) ≤ σ^2 < ∞ for all i, then

    Σ_{i=1}^∞ var(X_i) / i^2 ≤ σ^2 Σ_{i=1}^∞ 1/i^2 < ∞.

So, if the variances grow sublinearly, the theorem can apply.

We can get an even stronger conclusion:
Theorem 2 (Khinchine's Strong Law of Large Numbers). Suppose {X_i}_{i≥1} is an i.i.d. sequence (i.e., a sequence of i.i.d. r.v.s) with finite mean E[X_i] = µ < ∞. Then the sample mean converges almost surely to the ensemble mean:

    (1/n) Σ_{i=1}^n X_i → µ (a.s.).

Proving these types of theorems
The proofs follow from more general limit theorems.

Definition 11. Let {A_n}_{n≥1} be a sequence of events. The limit superior (lim sup) of A_n is

    lim sup_{n→∞} A_n = ∩_{n≥1} ∪_{k≥n} A_k.

This is the set of all points that are in A_n infinitely often: ω ∈ lim sup_n A_n iff ω is in infinitely many of the sets A_n. (It keeps coming back.) Another notation is lim sup_n A_n = {A_n i.o.} (infinitely often).

We observe that if A_n ↑ A or A_n ↓ A, then {A_n i.o.} = A.
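The "infinitely often" behavior is easy to see numerically. The sketch below simulates independent events A_n over a finite horizon and contrasts the two regimes that the Borel-Cantelli lemma (next) formalizes: summable probabilities (1/n^2) vs divergent ones (1/n). All parameters and names are illustrative.

```python
import random

rng = random.Random(3)
N, trials = 2000, 300

def last_occurrence(prob):
    # simulate independent events A_1..A_N with P(A_n) = prob(n);
    # return the largest n for which A_n occurred (0 if none did)
    last = 0
    for n in range(1, N + 1):
        if rng.random() < prob(n):
            last = n
    return last

# summable: sum 1/n^2 < infinity, so occurrences should die out early (part 1)
summable = [last_occurrence(lambda n: 1.0 / n**2) for _ in range(trials)]
# divergent + independent: sum 1/n = infinity, so events keep recurring (part 2)
divergent = [last_occurrence(lambda n: 1.0 / n) for _ in range(trials)]

late_summable = sum(1 for v in summable if v > 100)
late_divergent = sum(1 for v in divergent if v > N // 2)
```

In the summable case occurrences past n = 100 are rare (tail sum about 0.01), while in the divergent case roughly half the runs still see an event in the second half of the horizon.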
Lemma 1 (The Borel-Cantelli lemma). [This is frequently a good problem for math qualifiers.]

1. If Σ_{n=1}^∞ P(A_n) < ∞, then P(A_n i.o.) = 0.
2. (Conversely) If {A_n}_{n≥1} are independent events and Σ_{n=1}^∞ P(A_n) = ∞, then P(A_n i.o.) = 1.

Proof.

1. For all n,

    P(A_n i.o.) = P( ∩_{n≥1} ∪_{k≥n} A_k ) ≤ P( ∪_{k≥n} A_k ) ≤ Σ_{k≥n} P(A_k).

If Σ_{k≥1} P(A_k) < ∞, then the tail sum Σ_{k≥n} P(A_k) → 0 as n → ∞. So P(A_n i.o.) = 0.

2. Using DeMorgan's law,

    [A_n i.o.]^c = [ ∩_{n≥1} ∪_{k≥n} A_k ]^c = ∪_{n≥1} ∩_{k≥n} A_k^c.

Pick n and N with n < N. Consider

    P( ∩_{k=n}^N A_k^c ) = Π_{k=n}^N P(A_k^c)   (by independence)
                         = Π_{k=n}^N (1 − P(A_k))
                         ≤ Π_{k=n}^N e^{−P(A_k)}   (since 1 − x ≤ e^{−x})
                         = exp[ −Σ_{k=n}^N P(A_k) ].

If Σ_k P(A_k) diverges, then Σ_{k=n}^N P(A_k) diverges too (as N → ∞), and thus

    lim_{N→∞} exp[ −Σ_{k=n}^N P(A_k) ] = 0.

So

    P( ∩_{k≥n} A_k^c ) = lim_{N→∞} P( ∩_{k=n}^N A_k^c ) = 0   for all n.

Now, [lim sup_n A_n]^c is just the union of all of those intersections, so

    P( ∪_{n≥1} ∩_{k≥n} A_k^c ) ≤ Σ_{n≥1} P( ∩_{k≥n} A_k^c ) = 0,

so that P(A_n i.o.) = 1.
Kolmogorov's Inequality
Suppose X_1, X_2, . . . are independent with zero means and finite variances. Define S_n to be the running sum

    S_n = Σ_{k=1}^n X_k.

Then for each α > 0,

    P( max_{1≤k≤n} |S_k| ≥ α ) ≤ (1/α^2) var(S_n).

This is a lot like the Chebyshev inequality, but instead of looking at the variance of all of the terms, we simply look at the variance of the last one.

Central Limit Theorems
Theorem 3 (Central Limit Theorem). Suppose {X_n} is a sequence of i.i.d. random variables with mean µ < ∞ and variance σ^2 < ∞. Then

    (1/√n) Σ_{i=1}^n (X_i − µ) → X (in distribution),

where X ~ N(0, σ^2). That is,

    P( (1/√n) Σ_{i=1}^n (X_i − µ) ≤ x ) → ∫_{−∞}^x (1/√(2πσ^2)) e^{−t^2/(2σ^2)} dt.

The main point: sums of i.i.d. random variables tend to look Gaussian.
To work our way up to this, here are a couple of lemmas:
Lemma 2. Suppose {X_n} is a sequence of r.v.s with characteristic functions φ_n. If there exists a r.v. X with ch.f. φ such that

    lim_{n→∞} φ_n(u) = φ(u)   for all u ∈ R,

then X_n → X (in distribution).
Lemma 3. Suppose X is a r.v. with E[X^2] < ∞. Then φ_X has the expansion

    φ_X(u) = 1 + i u E[X] − (u^2/2)( E[X^2] + δ(u) ),

where lim_{u→0} δ(u) = 0.

Proof of the Central Limit Theorem. For convenience (w.o.l.o.g.), take µ = 0. Define S_n = (1/√n) Σ_{i=1}^n X_i. Then

    φ_{S_n}(u) = E[exp(i u S_n)] = E[exp( i (u/√n) Σ_{i=1}^n X_i )]
               = Π_{i=1}^n E[exp( i (u/√n) X_i )]
               = [ φ_X(u/√n) ]^n
               = [ 1 − ((u/√n)^2 / 2)( E[X_i^2] + δ(u/√n) ) ]^n
               = [ 1 − (u^2/2n)( σ^2 + δ(u/√n) ) ]^n.

From "elementary" calculus we recall that

    (1 + a_n)^n → e^{lim_{n→∞} n a_n}.

Thus

    φ_{S_n}(u) → exp( lim_{n→∞} (−u^2/2)( σ^2 + δ(u/√n) ) ) = exp( −σ^2 u^2 / 2 ).

This is the form of a characteristic function of a Gaussian (with zero mean and variance σ^2).

Summarizing, if X_k has zero mean and variance 1,

    (1/n) Σ_{k=1}^n X_k → 0 (a.s.)
    (1/√n) Σ_{k=1}^n X_k → N(0, 1) (in distribution).
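The final summary can be illustrated by simulation (a sketch; Uniform[−√3, √3] is one convenient zero-mean, variance-1 choice, and the sample sizes are illustrative):

```python
import math
import random

rng = random.Random(4)

def normalized_sum(n):
    # (1/sqrt(n)) * sum of n i.i.d. zero-mean, variance-1 r.v.s
    # (Uniform[-sqrt(3), sqrt(3)] has mean 0 and variance 1)
    a = math.sqrt(3.0)
    return sum(rng.uniform(-a, a) for _ in range(n)) / math.sqrt(n)

samples = [normalized_sum(200) for _ in range(3000)]
# the empirical c.d.f. at 0 should approach Phi(0) = 1/2 ...
ecdf_at_0 = sum(1 for s in samples if s <= 0.0) / len(samples)
# ... and the sample second moment should approach sigma^2 = 1
var_hat = sum(s * s for s in samples) / len(samples)
```

Even though each summand is far from Gaussian, the normalized sums behave like N(0, 1), which is the content of the Central Limit Theorem above.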
This note was uploaded on 03/01/2012 for the course ECE 6010 taught by Professor Stites, M. during the Spring '08 term at Utah State University.