In Example 3.4, we have
$$f_Y(y) = \int_0^{0.5} f_{X,Y}(x, y)\,dx = \int_0^{0.5} 2x e^{-y}\,dx = e^{-y}, \quad y > 0.$$
In Example 3.5, we have
$$f_Y(y) = f_{X,Y}(1, y) + f_{X,Y}(-1, y) = \frac{0.5}{\sqrt{2\pi}} \left( e^{-\frac{1}{2}(y-1)^2} + e^{-\frac{1}{2}(y+1)^2} \right).$$
Sometimes we are interested in the conditional random variable $X \mid Y = y$, e.g. the stock price given a good economy. In this sense we are “living in the world of $Y = y$”: our universe is (the sample space of) the event $Y = y$, so we zoom into that event and rescale the probabilities so that $P(\text{all possibilities} \mid Y = y) = 1$. In view of this, we define the conditional probability function
$$f_{X \mid Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)},$$
provided that $f_Y(y) > 0$.
Note the similarity between this and the conditional probability for events,
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$
provided that $P(B) \neq 0$.
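The rescaling behind the conditional probability function can be sketched for a discrete joint distribution. The joint pmf below is a hypothetical table invented for illustration (it is not taken from any example in these notes), and the helper names `marginal_y` and `conditional_x_given_y` are assumptions of this sketch.

```python
from fractions import Fraction

# Hypothetical joint pmf f_{X,Y}(x, y); the numbers are illustrative only.
joint = {
    (0, 0): Fraction(1, 10), (1, 0): Fraction(3, 10),
    (0, 1): Fraction(2, 10), (1, 1): Fraction(4, 10),
}

def marginal_y(joint, y):
    """f_Y(y): sum the joint pmf over all x."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def conditional_x_given_y(joint, y):
    """f_{X|Y}(x | y) = f_{X,Y}(x, y) / f_Y(y), defined only when f_Y(y) > 0."""
    fy = marginal_y(joint, y)
    if fy == 0:
        raise ValueError("f_Y(y) must be positive to condition on Y = y")
    return {x: p / fy for (x, yy), p in joint.items() if yy == y}

cond = conditional_x_given_y(joint, 1)
print(cond)
print(sum(cond.values()))  # the rescaled probabilities sum to 1
```

Dividing by $f_Y(y)$ is exactly the "zoom in and rescale" step: within the slice $Y = 1$ the probabilities $2/10$ and $4/10$ become $1/3$ and $2/3$.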
3.4. Descriptive Statistics
We often want to summarize a random variable (equivalently, its distribution function $F_X$) using a few numbers. The most common ones are the moments, especially the first two: the mean (expected value) roughly gives the “location”, and the variance (related to the second moment) gives the “spread”.
For discrete random variables, the formula is
$$E(X) = \sum_{x \in \Omega} x f_X(x).$$
Example 3.8. For $X \sim \text{Binomial}(2, 0.3)$, we have
$$f_X(0) = 0.49, \quad f_X(1) = 0.42, \quad f_X(2) = 0.09.$$
Therefore, the expected value of $X$ is given by
$$E(X) = 0 \times 0.49 + 1 \times 0.42 + 2 \times 0.09 = 0.42 + 0.18 = 0.6.$$
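The arithmetic in Example 3.8 can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function names `binomial_pmf` and `expectation` are chosen here for illustration.

```python
from math import comb

def binomial_pmf(n, p):
    """Return {k: P(X = k)} for X ~ Binomial(n, p)."""
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def expectation(pmf):
    """E(X) = sum over x of x * f_X(x) for a discrete pmf given as a dict."""
    return sum(x * fx for x, fx in pmf.items())

pmf = binomial_pmf(2, 0.3)
mean = expectation(pmf)
print({k: round(v, 2) for k, v in pmf.items()})  # {0: 0.49, 1: 0.42, 2: 0.09}
print(round(mean, 2))                            # 0.6
```

The result agrees with the general fact that a Binomial$(n, p)$ random variable has mean $np$, here $2 \times 0.3 = 0.6$.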
For continuous random variables, the formula is
$$E(X) = \int_{x \in \Omega} x f_X(x)\,dx = \int_{x \in \Omega} x\,dF_X(x).$$
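Example 3.9. For $X \sim \exp(\lambda = 2)$, we have $f_X(x) = 2e^{-2x}$, $x > 0$. Its expected value is given by
$$\begin{aligned}
E(X) &= \int_0^\infty x \cdot 2e^{-2x}\,dx = \int_0^\infty -x\,d(e^{-2x}) \\
&= \left[-x e^{-2x}\right]_{x=0}^{x \to \infty} + \int_0^\infty e^{-2x}\,dx \quad \text{(integration by parts)} \\
&= \frac{1}{2} \int_0^\infty 2e^{-2x}\,dx = \frac{1}{2}.
\end{aligned}$$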
If furthermore $X \geq 0$, then
$$E(X) = \int_0^\infty (1 - F_X(x))\,dx,$$
where $1 - F_X(x)$ is sometimes called the survival function, denoted by either $\bar{F}_X$ or $S_X$.
Example 3.10. In the above example, we have $1 - F_X(x) = e^{-2x}$, and therefore the expected value can be calculated as
$$E(X) = \int_0^\infty e^{-2x}\,dx = \frac{1}{2} \int_0^\infty 2e^{-2x}\,dx = \frac{1}{2}.$$
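Both routes to the mean of the exponential distribution with $\lambda = 2$ (integrating $x f_X(x)$, and integrating the survival function) can be checked numerically. The sketch below uses a simple midpoint rule and truncates the integral at an upper limit of 20, which is an assumption of this illustration; the tail beyond that point is negligible for $\lambda = 2$.

```python
from math import exp

def midpoint_integral(g, a, b, n=200_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

lam = 2.0

def density(x):
    """f_X(x) = lam * exp(-lam * x) for x > 0."""
    return lam * exp(-lam * x)

def survival(x):
    """1 - F_X(x) = exp(-lam * x) for x > 0."""
    return exp(-lam * x)

upper = 20.0  # truncation point standing in for infinity (assumption)
mean_from_density = midpoint_integral(lambda x: x * density(x), 0.0, upper)
mean_from_survival = midpoint_integral(survival, 0.0, upper)

print(round(mean_from_density, 6))
print(round(mean_from_survival, 6))
```

Both approximations should agree with the exact value $1/\lambda = 1/2$ up to the discretization error of the rule.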
For multiple random variables, we have
$$E(X \mid Y = y) = \begin{cases} \sum_{x : (x,y) \in \Omega} x f_{X \mid Y}(x \mid y) & \text{if } X \text{ is discrete,} \\ \int_{x : (x,y) \in \Omega} x f_{X \mid Y}(x \mid y)\,dx & \text{if } X \text{ is continuous,} \end{cases}$$
which is a function of $y$. Note that if we do not assign a value to $Y$, then $E(X \mid Y)$ is a function of $Y$, i.e. a random variable, since we do not know what $Y$ is. Its mean is $E[X]$, i.e.
$$E(E(X \mid Y)) = E[X]. \quad \text{(Double expectation formula / Tower property)}$$
This implies the conditional variance formula:
$$\mathrm{Var}(X) = E(\mathrm{Var}(X \mid Y)) + \mathrm{Var}(E(X \mid Y)).$$
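The tower property and the conditional variance formula can be verified exactly on a small discrete example. The joint pmf below is hypothetical (invented for this sketch), and exact fractions are used so that the identities hold without rounding error.

```python
from fractions import Fraction as F

# Hypothetical joint pmf f_{X,Y}(x, y); the numbers are illustrative only.
joint = {(0, 0): F(1, 10), (2, 0): F(3, 10), (1, 1): F(2, 10), (3, 1): F(4, 10)}

def marginal_y(joint):
    """Return {y: f_Y(y)} by summing the joint pmf over x."""
    fy = {}
    for (_, y), p in joint.items():
        fy[y] = fy.get(y, 0) + p
    return fy

def cond_moment(joint, fy, y, power):
    """E(X**power | Y = y) computed from the conditional pmf."""
    return sum(x**power * p for (x, yy), p in joint.items() if yy == y) / fy[y]

fy = marginal_y(joint)
cond_mean = {y: cond_moment(joint, fy, y, 1) for y in fy}
cond_var = {y: cond_moment(joint, fy, y, 2) - cond_mean[y] ** 2 for y in fy}

# Unconditional mean and variance of X.
mean_x = sum(x * p for (x, _), p in joint.items())
var_x = sum(x**2 * p for (x, _), p in joint.items()) - mean_x**2

# E(E(X|Y)), E(Var(X|Y)) and Var(E(X|Y)), all averaged over the law of Y.
tower = sum(cond_mean[y] * fy[y] for y in fy)
mean_of_cond_var = sum(cond_var[y] * fy[y] for y in fy)
var_of_cond_mean = sum(cond_mean[y] ** 2 * fy[y] for y in fy) - tower**2

print(tower == mean_x)                               # True
print(mean_of_cond_var + var_of_cond_mean == var_x)  # True
```

Here `cond_mean` plays the role of the random variable $E(X \mid Y)$: it takes the value `cond_mean[y]` with probability $f_Y(y)$.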
3.5. Expectation and Related Quantities
Note from the formula of $E$ that $E$ is linear, i.e. for a real number $a$ and two random variables $X$ and $Y$, we have
$$E(aX) = aE(X)$$
and
$$E(X + Y) = E(X) + E(Y).$$
Variance is defined as
$$\mathrm{Var}(X) = E\big((X - E(X))^2\big) = E(X^2) - (E(X))^2,$$
which captures the dispersion of the random variable $X$.
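As a quick check that the two expressions for the variance agree, the sketch below evaluates both on the Binomial$(2, 0.3)$ pmf from Example 3.8, using exact fractions. The helper name `expect` is an assumption of this illustration.

```python
from fractions import Fraction as F

# pmf of X ~ Binomial(2, 0.3), as listed in Example 3.8.
pmf = {0: F(49, 100), 1: F(42, 100), 2: F(9, 100)}

def expect(pmf, g):
    """E(g(X)) for a discrete pmf given as {x: f_X(x)}."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(pmf, lambda x: x)
var_central = expect(pmf, lambda x: (x - mean) ** 2)     # E((X - E(X))^2)
var_shortcut = expect(pmf, lambda x: x**2) - mean**2     # E(X^2) - (E(X))^2

print(var_central)   # 21/50
print(var_shortcut)  # 21/50
```

The common value $21/50 = 0.42$ matches the known Binomial variance $np(1-p) = 2 \times 0.3 \times 0.7$.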
Covariance between two random variables $X$ and $Y$ is defined as
$$\mathrm{Cov}(X, Y) = E\big((X - E(X))(Y - E(Y))\big) = E(XY) - E(X)E(Y),$$
which measures the degree of (linear) dependence between them. In particular, the Pearson correlation, defined as
$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}},$$
removes the dependence on the units of $X$ and $Y$. The Pearson correlation can only take values between $-1$ and $1$. It is $-1$ when $Y = \alpha + \beta X$ for some real number $\alpha$ and negative number $\beta$, i.e. a linear relationship with negative slope.
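The claim about the extreme value $-1$ can be illustrated numerically. The sketch below takes the Binomial$(2, 0.3)$ pmf from Example 3.8 for $X$ and sets $Y = \alpha + \beta X$ with the arbitrarily chosen values $\alpha = 5$ and $\beta = -2$ (assumptions of this illustration); the helper names `expect` and `pearson` are likewise hypothetical.

```python
from math import sqrt

def expect(joint, g):
    """E(g(X, Y)) for a discrete joint pmf given as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def pearson(joint):
    """rho(X, Y) = Cov(X, Y) / sqrt(Var(X) * Var(Y))."""
    ex = expect(joint, lambda x, y: x)
    ey = expect(joint, lambda x, y: y)
    cov = expect(joint, lambda x, y: x * y) - ex * ey
    var_x = expect(joint, lambda x, y: x * x) - ex**2
    var_y = expect(joint, lambda x, y: y * y) - ey**2
    return cov / sqrt(var_x * var_y)

pmf_x = {0: 0.49, 1: 0.42, 2: 0.09}  # X ~ Binomial(2, 0.3)
alpha, beta = 5.0, -2.0              # illustrative constants, beta < 0

# Y is a deterministic linear function of X, so all mass sits on the line.
joint = {(x, alpha + beta * x): p for x, p in pmf_x.items()}
rho = pearson(joint)
print(round(rho, 6))  # -1.0
```

Changing `alpha` or the magnitude of `beta` leaves the result unchanged, which reflects the unit-free nature of the correlation; flipping the sign of `beta` would give $+1$ instead.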
 One '16
 Jin Xia Zhu