The product of the conditional density $f_{Y|X}(y|x)$ and the marginal density $f_X(x)$ is the joint density $f_{XY}(x,y)$, so we now have
$$E(E(g(Y)|X)) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(y)\,f_{XY}(x,y)\,dx\,dy = \int_{-\infty}^{\infty} g(y)\left(\int_{-\infty}^{\infty} f_{XY}(x,y)\,dx\right)dy = \int_{-\infty}^{\infty} g(y)\,f_Y(y)\,dy = E(g(Y)).$$
A similar argument can be used to justify the law of iterated expectations when the pair $(X, Y)$ is discrete.
Convergence in probability
Suppose we have an infinite sequence of random variables $Z_1, Z_2, Z_3, \ldots$. Imagine that, as we move along this sequence, the random variables $Z_n$ start to settle down to some constant value. For instance, it might be the case that the $n$th variable in our sequence, $Z_n$, has the uniform distribution on $(1 - 1/n,\, 1 + 1/n)$. In this case, as $n$ increases, the distribution of $Z_n$ becomes more and more tightly concentrated around one. In some sense, $Z_n$ converges to one as $n \to \infty$. But since each $Z_n$ is random, we need to find a suitable way to define this convergence that involves a statement about the random behavior of the different $Z_n$'s.
A sequence of random variables $Z_1, Z_2, Z_3, \ldots$ is said to converge in probability to some constant $c$ if, for every strictly positive number $\varepsilon > 0$, we have
$$\lim_{n \to \infty} P(|Z_n - c| > \varepsilon) = 0.$$
In words, this means that for any positive $\varepsilon$ we can think of, no matter how small, the probability that $Z_n$ differs from $c$ by more than $\varepsilon$ will eventually decrease to zero as $n$ increases toward infinity. If $Z_1, Z_2, Z_3, \ldots$ converges in probability to $c$, we say that $Z_n \to_p c$ as $n \to \infty$.
Returning to our example where $Z_n \sim U(1 - 1/n,\, 1 + 1/n)$, let us verify that in fact $Z_n \to_p 1$. Choose any number $\varepsilon > 0$. We need to show that $\lim_{n\to\infty} P(|Z_n - 1| > \varepsilon) = 0$. Since $Z_n \sim U(1 - 1/n,\, 1 + 1/n)$, it must lie in the interval $[1 - 1/n,\, 1 + 1/n]$ with probability one. If $n > 1/\varepsilon$, then $1/n < \varepsilon$, so the interval $[1 - 1/n,\, 1 + 1/n]$ lies strictly inside the interval $[1 - \varepsilon,\, 1 + \varepsilon]$. Therefore, we will have $P(|Z_n - 1| > \varepsilon) = 0$ for all $n > 1/\varepsilon$. No matter how small we chose $\varepsilon$, as $n \to \infty$ we will eventually have $n > 1/\varepsilon$. This shows that $\lim_{n\to\infty} P(|Z_n - 1| > \varepsilon) = 0$.
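The verification above can also be seen empirically. The sketch below (an illustration, not part of the text) draws $Z_n \sim U(1 - 1/n,\, 1 + 1/n)$ repeatedly and estimates $P(|Z_n - 1| > \varepsilon)$ for $\varepsilon = 0.1$; the estimate shrinks as $n$ grows and is exactly zero once $n > 1/\varepsilon = 10$.

```python
import random

random.seed(1)

# Estimate P(|Z_n - 1| > eps) by drawing Z_n ~ U(1 - 1/n, 1 + 1/n).
# (Illustrative check; the helper name is ours, not from the notes.)
def estimate_tail_prob(n, eps, draws=100_000):
    count = 0
    for _ in range(draws):
        z = random.uniform(1 - 1 / n, 1 + 1 / n)
        if abs(z - 1) > eps:
            count += 1
    return count / draws

eps = 0.1
for n in (2, 5, 20, 100):
    print(n, estimate_tail_prob(n, eps))
# Once n > 1/eps = 10, the support lies inside (1 - eps, 1 + eps),
# so the estimated probability is exactly 0.
```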
Let us consider another example that is slightly less trivial. Suppose this time that the $n$th random variable in our sequence, $Z_n$, is continuous with pdf $f_n$ given by
$$f_n(x) = \begin{cases} (2n - 1)x^{-2n} & \text{for } x \geq 1 \\ 0 & \text{for } x < 1. \end{cases}$$
The pdf $f_n$ is highest when $x = 1$, where it is equal to $2n - 1$. As we move rightward along the $x$ axis from one, $f_n(x)$ decays smoothly to zero at the rate $x^{-2n}$. We can see that when $n$ increases, $f_n$ spikes higher and sharper at one, and decays more rapidly to zero as we move along the $x$ axis. In a sense, the sequence of pdfs $f_n$ is gathering into an infinitely tall spike at $x = 1$ as $n \to \infty$. This is what we would expect if $Z_n \to_p 1$, which is in fact the case, as we will now show. Choose any number $\varepsilon > 0$. Since $f_n(x) = 0$ for $x < 1$, it must be true that $P(|Z_n - 1| > \varepsilon) = P(Z_n > 1 + \varepsilon)$. Therefore,
$$P(|Z_n - 1| > \varepsilon) = \int_{1+\varepsilon}^{\infty} (2n - 1)x^{-2n}\,dx = \left[-x^{1-2n}\right]_{1+\varepsilon}^{\infty} = (1 + \varepsilon)^{1 - 2n}.$$
Since $1 + \varepsilon > 1$ and the exponent $1 - 2n \to -\infty$, this probability converges to zero as $n \to \infty$, which shows that $Z_n \to_p 1$.
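To illustrate this computation numerically (the check below is ours, not part of the text), we can draw from $f_n$ by inverse-CDF sampling and compare the empirical tail probability with the exact value $(1 + \varepsilon)^{1-2n}$. Integrating the pdf gives the CDF $F_n(x) = 1 - x^{1-2n}$ for $x \geq 1$, so $F_n^{-1}(u) = (1 - u)^{-1/(2n - 1)}$.

```python
import random

random.seed(2)

# Inverse-CDF sampler for the pdf f_n(x) = (2n - 1) x^{-2n} on [1, inf):
# F_n(x) = 1 - x^{1-2n}, so F_n^{-1}(u) = (1 - u)^{-1/(2n - 1)}.
# (Helper names are ours; only the pdf comes from the notes.)
def sample_zn(n):
    u = random.random()
    return (1.0 - u) ** (-1.0 / (2 * n - 1))

def tail_prob_mc(n, eps, draws=200_000):
    """Monte Carlo estimate of P(Z_n > 1 + eps)."""
    return sum(sample_zn(n) > 1 + eps for _ in range(draws)) / draws

eps = 0.5
for n in (1, 2, 5):
    exact = (1 + eps) ** (1 - 2 * n)  # the closed form derived above
    print(n, exact, tail_prob_mc(n, eps))
```

For each $n$, the empirical frequency matches the closed form $(1+\varepsilon)^{1-2n}$ up to sampling noise, and both shrink rapidly as $n$ grows.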
Spring '08, Stohs