Question: Why is \(\mu_X\) called the measure of central tendency?
Suppose \(X_1, X_2, \ldots, X_n\) follow the same distribution as \(X\), and these random variables are mutually independent. Then the so-called law of large numbers (LLN) implies
\[
n^{-1} \sum_{i=1}^{n} X_i \to \mu_X
\]
in a certain sense as \(n \to \infty\). See Chapter 6 later.
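As a rough illustration of this convergence, the following Python sketch averages i.i.d. draws and prints the sample mean for increasing sample sizes. The Uniform(0, 1) distribution (whose mean is 0.5), the helper name `sample_mean`, and the seed are assumptions made only for this example; they do not come from the text.

```python
import random

def sample_mean(n, seed=0):
    """Average of n i.i.d. draws from Uniform(0, 1), whose mean is 0.5."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return sum(rng.random() for _ in range(n)) / n

# The sample mean should settle near 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, sample_mean(n))
```

The printed averages should drift toward 0.5 as the sample size increases, which is the sense in which \(\mu_X\) is the "center" of the distribution.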
(ii) It is said that the expectation \(\mu_X\) exists for a continuous distribution if and only if
\[
\int_{-\infty}^{\infty} |x| f_X(x)\,dx < \infty.
\]
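Whenever \(X\) is a bounded random variable, that is, whenever there are numbers \(a\) and \(b\) (\(-\infty < a < b < \infty\)) such that \(\Pr(a \le X \le b) = 1\), then \(\mu_X\) must exist. An unbounded random variable, by contrast, need not have an expectation. For example, recall that the p.d.f. of the Cauchy distribution is
\[
f_X(x) = \frac{1}{\pi} \, \frac{1}{1 + x^2}, \quad \text{for } -\infty < x < \infty;
\]
then
\[
\int_{-\infty}^{\infty} |x| f_X(x)\,dx = \frac{2}{\pi} \int_{0}^{\infty} \frac{x}{1 + x^2}\,dx = \infty.
\]
Therefore, the expectation doesn't exist for the Cauchy distribution.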
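To see the divergence concretely, one can truncate the integral to \([-T, T]\), where it equals \(\ln(1 + T^2)/\pi\), and let \(T\) grow. The sketch below evaluates that closed form and cross-checks it with a simple midpoint rule; the function names and the choice of \(T\) values are assumptions for illustration only.

```python
import math

def truncated_abs_moment(T):
    """Closed form of the integral of |x| f_X(x) over [-T, T] for the Cauchy p.d.f."""
    return math.log(1.0 + T * T) / math.pi

def midpoint_check(T, steps=200_000):
    """Midpoint-rule approximation of the same truncated integral."""
    h = T / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x / (1.0 + x * x)
    # By symmetry, the integral over [-T, T] is twice the one over [0, T].
    return 2.0 / math.pi * total * h

# The truncated integral keeps growing (like log T) instead of levelling off.
for T in (10.0, 1e3, 1e6):
    print(T, truncated_abs_moment(T))
```

Because the truncated value grows without bound as \(T\) increases, no finite limit exists, which matches the conclusion above.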
(iii) If \(X\) is the stock return, \(\mu_X\) is the expected stock return or long-run average stock return.
(iv) The terminology of expectation has its origin in games of chance. This can be illustrated
as follows. Four small similar chips, numbered 1, 1, 1, and 2, respectively, are placed in a bowl
and are mixed. A player is blindfolded and is to draw a chip from the bowl. If she draws one of
the three chips numbered 1, she will receive one dollar. If she draws the chip numbered 2, she
will receive two dollars. It seems reasonable to assume that the player has a "\(\frac{3}{4}\) claim" on the $1 and a "\(\frac{1}{4}\) claim" on the $2. Her "total claim" is
\[
1 \cdot \frac{3}{4} + 2 \cdot \frac{1}{4} = \frac{5}{4},
\]
that is, $1.25. Thus the expectation of the payoff \(X\) is precisely the player's claim in this game.
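The chip game can be reproduced in a few lines of Python. This is only a sketch: the list `chips` encodes the four chips described above, each assumed equally likely to be drawn, and the helper name `expectation` is hypothetical.

```python
from fractions import Fraction

# Payoffs (in dollars) attached to the four chips in the bowl.
chips = [1, 1, 1, 2]

def expectation(outcomes):
    """Expected payoff when each listed outcome is drawn with equal probability."""
    n = len(outcomes)
    return sum(Fraction(x, n) for x in outcomes)

total_claim = expectation(chips)
print(total_claim, float(total_claim))
```

Exact fractions are used so that the result can be compared directly with the hand calculation of \(\frac{5}{4}\).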
Theorem: If \(Y = aX + b\), then \(\mu_Y = a\mu_X + b\).
Remark: The expectation \(E(\cdot)\) is a linear operator. Here, \(a\) is the scale parameter and \(b\) is the location parameter.
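For the continuous case, the linearity result can be verified in a few steps. The derivation below is a sketch of the standard argument, not a proof taken from the text; it assumes that \(\mu_X\) exists, that \(a\) and \(b\) are constants, and that \(f_X\) integrates to one.

```latex
\begin{align*}
\mu_Y = E(aX + b)
  &= \int_{-\infty}^{\infty} (ax + b)\, f_X(x)\,dx \\
  &= a \int_{-\infty}^{\infty} x f_X(x)\,dx + b \int_{-\infty}^{\infty} f_X(x)\,dx \\
  &= a\mu_X + b .
\end{align*}
```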
Theorem: Suppose \(E(X^2)\) exists. Then
\[
\mu_X = \arg\min_{a} E(X - a)^2 .
\]
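Proof: The first-order condition for a minimum is
\[
\frac{dE(X - a)^2}{da} = 0 .
\]
In the continuous case,
\[
\frac{d\left[\int_{-\infty}^{\infty} (x - a)^2 f_X(x)\,dx\right]}{da} = -2 \int_{-\infty}^{\infty} (x - a) f_X(x)\,dx = 0,
\]
so we have
\[
a = \frac{\int_{-\infty}^{\infty} x f_X(x)\,dx}{\int_{-\infty}^{\infty} f_X(x)\,dx} = \mu_X .
\]
The second derivative with respect to \(a\) is \(2 \int_{-\infty}^{\infty} f_X(x)\,dx = 2 > 0\), so this stationary point is indeed a minimum.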
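The same minimizing property holds for discrete distributions, and it can be checked numerically. The sketch below reuses the chip-game payoff distribution as an assumed example and searches a hypothetical grid of candidate values \(a\) for the one with the smallest mean squared deviation; the function names are illustrative only.

```python
from fractions import Fraction

# Assumed example distribution: P(X = 1) = 3/4, P(X = 2) = 1/4.
pmf = {1: Fraction(3, 4), 2: Fraction(1, 4)}

def mean(pmf):
    """Expectation of a discrete random variable given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def mean_squared_deviation(pmf, a):
    """E(X - a)^2 for a discrete random variable."""
    return sum((x - a) ** 2 * p for x, p in pmf.items())

# Hypothetical search grid: a = 0.00, 0.01, ..., 3.00, stored as exact fractions.
grid = [Fraction(k, 100) for k in range(301)]
best_a = min(grid, key=lambda a: mean_squared_deviation(pmf, a))

print(best_a, mean(pmf))
```

The grid minimizer coincides with the mean because the objective is a strictly convex quadratic in \(a\) and the mean happens to lie on the grid.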
Questions: Does \(X = \mu_X\) have the largest probability of occurring? Is \(P(X = \mu_X)\) the largest?
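Answer: No. For example, for a Bernoulli r.v. with \(\mu_X = 0.5\), \(X\) takes only the values 0 and 1, so \(P(X = \mu_X) = P(X = 0.5) = 0\), which is the smallest possible probability.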
Case II: \(g(X) = (X - \mu_X)^2\).
Definition [Variance of \(X\)]: The variance of a r.v. \(X\) is
\[
\sigma_X^2 = E(X - \mu_X)^2 =
\begin{cases}
\sum_{x} (x - \mu_X)^2 f_X(x), & \text{d.r.v.} \\
\int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx, & \text{c.r.v.}
\end{cases}
\]
where the summation is over all possible \(x\)'s. The standard deviation of \(X\) is given by
\[
\sigma_X = \sqrt{\sigma_X^2} .
\]
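For a discrete random variable, the definition translates directly into code. The following sketch computes the variance and standard deviation of the chip-game payoff discussed earlier; the dictionary `chip_pmf` and the helper name `variance` are assumptions introduced for this example.

```python
import math
from fractions import Fraction

def variance(pmf):
    """Variance of a discrete r.v. given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())                 # mean
    return sum((x - mu) ** 2 * p for x, p in pmf.items())   # E(X - mu)^2

# Chip-game payoff: P(X = 1) = 3/4, P(X = 2) = 1/4.
chip_pmf = {1: Fraction(3, 4), 2: Fraction(1, 4)}

var_x = variance(chip_pmf)   # exact fraction
sd_x = math.sqrt(var_x)      # standard deviation as a float

print(var_x, sd_x)
```

Keeping the variance as an exact fraction and taking the square root only at the end avoids rounding error in the intermediate steps.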
Remarks:
(i) \(\sigma_X^2\) is a measure of the degree of spread of a distribution around its mean. A larger value of \(\sigma_X^2\) means that \(X\) is more variable. It is a scale parameter for the distribution of \(X\).
(ii) In economics, it is interpreted as a measure of uncertainty. It is often called a measure of "volatility" of \(X\). A larger value of \(\sigma_X^2\) means that \(X\) is more variable. In contrast, at the extreme, if \(\sigma_X^2 = 0\), then \(X = \mu_X\) with probability 1 and there is no variation in \(X\).
Consider the d.r.v. case:
\[
\sigma_X^2 = \sum_{x} (x - \mu_X)^2 f_X(x) = 0
\]
if and only if \((x - \mu_X)^2 f_X(x) = 0\) for all \(x\), which implies \(x = \mu_X\) for every \(x\) with \(f_X(x) > 0\) (no uncertainty). This is an example of a degenerate distribution.