The mean is a way to visualize where the center of the distribution lies, and is one of the most important summary
statistics for a distribution.
In addition to computing the expectation of a random variable, we can also compute the expectation of a function of a random variable.
Definition 1.11. The expectation of a function of a random variable $g(X)$ is given by
$$E(g(X)) = \sum_{x \in S} g(x) f(x) \quad \text{(for a discrete distribution)}$$
$$E(g(X)) = \int_{S} g(x) f(x) \, dx \quad \text{(for a continuous distribution)}$$
Note that in this definition, $g(X)$ can contain variables other than $X$, but it can only contain one random variable, $X$.
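As a small illustration of the discrete formula, the sketch below computes $E(X)$ and $E(X^2)$ for a fair six-sided die. The die, the helper name `expectation`, and the dictionary representation of the pmf are assumptions made for this example, not part of the definition.

```python
from fractions import Fraction

def expectation(g, pmf):
    """Return E(g(X)) = sum over x in S of g(x) * f(x) for a discrete pmf.

    `pmf` is a dict mapping each support point x to its probability f(x).
    """
    return sum(g(x) * prob for x, prob in pmf.items())

# Assumed example: X is a fair six-sided die, so f(x) = 1/6 on S = {1, ..., 6}.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = expectation(lambda x: x, die_pmf)              # g(x) = x gives E(X)
second_moment = expectation(lambda x: x**2, die_pmf)  # g(x) = x^2 gives E(X^2)
print(mean, second_moment)
```

Exact fractions are used so that the sums are not affected by floating-point rounding. Here $E(X) = 7/2$ and $E(X^2) = 91/6$.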
Finally, we will round off our discussion with the topic of percentiles and the median.
Definition 1.12. The 100$p$th percentile of a distribution is any value $\pi_p$ such that the CDF satisfies $F(\pi_p^-) \le p \le F(\pi_p)$. The median is defined to be the 50th percentile, $\pi_{0.5}$.
The median is another useful measure of centrality. Compared to the mean, it is more robust (less sensitive) to the
presence of outliers.
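To see the robustness claim numerically, the following sketch treats a small data set as an empirical distribution and compares its mean and median before and after one value is replaced by an outlier. The data values are made up for illustration.

```python
from statistics import mean, median

# Hypothetical data: the second list replaces the largest value with an outlier.
sample = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 500]

print("mean:  ", mean(sample), "->", mean(with_outlier))
print("median:", median(sample), "->", median(with_outlier))
```

The mean moves from 3 to 102, while the median stays at 3.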
Remark. If the CDF $F(x)$ is continuous at $\pi_p$, then the 100$p$th percentile is simply a value $\pi_p$ that satisfies $F(\pi_p) = p$. The CDF of any continuous distribution is continuous at $\pi_p$, regardless of $p$.
If the CDF $F(x)$ is not continuous at $\pi_p$, then we may need to use limits as in our definition. The minus sign in the superscript of $F(\pi_p^-)$ signifies the limit of $F(x)$ as $x$ approaches $\pi_p$ from the left.
Note that while a continuous distribution with a strictly increasing CDF has only one value for the 100$p$th percentile, a discrete distribution may have an interval of values corresponding to the 100$p$th percentile.
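For the continuous case in the remark, finding the 100$p$th percentile amounts to solving $F(\pi_p) = p$. The sketch below does this for the standard normal distribution using Python's standard-library `statistics.NormalDist`, whose `inv_cdf` method inverts the CDF. The choice of distribution and the helper name `percentile` are assumptions made for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def percentile(dist, p):
    """Return the value pi_p with F(pi_p) = p for a continuous NormalDist."""
    return dist.inv_cdf(p)

for p in (0.25, 0.5, 0.9):
    pi_p = percentile(std_normal, p)
    # Plugging pi_p back into the CDF should recover p (up to rounding).
    print(p, round(pi_p, 4), round(std_normal.cdf(pi_p), 4))
```

Because this CDF is continuous and strictly increasing, each $p$ in $(0, 1)$ yields a single value; for $p = 0.5$ the result is the median, 0.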
The following example illustrates all three points above for the case of $\pi_{0.5}$, the median, but the idea extends to all percentiles.
Example 1.13. Examine closely the 3 CDFs in the figure below. To satisfy your curiosity, Distribution 1 is a continuous distribution, namely the standard normal distribution (which we define in Section 1.2). The other two are discrete distributions, which you should recognize because of the discontinuous nature of their CDFs. Distribution 2 is a Unif$(\{1, 2\})$ and Distribution 3 is a Unif$(\{1, 3, 5\})$.
[Figure: three panels, each plotting the CDF $F(x)$ against $x$.]
(a) Distribution 1. A continuous distribution. Median = 0.
(b) Distribution 2. A discrete distribution. The set of medians is an interval $[1, 2)$.
(c) Distribution 3. A discrete distribution. Median = 3.
The graphical procedure for computing $\pi_p$ is to plot $F(x)$, draw a horizontal line at $F(x) = p$, and read off the corresponding values of $x$ where the horizontal line intersects the CDF. Using this procedure, we find that the median for Distribution 1 is 0 and that the medians for Distribution 2 are the entire set $[1, 2)$. (Note that the ')' to the right of '2' indicates that the value of 2 is not included in this interval, since $F(2) = 1 > 0.5$.)
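The graphical procedure can be mimicked numerically for the two discrete distributions. The sketch below builds the CDF of a discrete uniform distribution and reports the smallest support point at which the CDF reaches height $p$, that is, the leftmost point where the horizontal line meets the CDF. The helper names `uniform_pmf`, `cdf`, and `smallest_percentile` are hypothetical, and the function returns only the left endpoint, not the whole interval of percentiles.

```python
from fractions import Fraction

def uniform_pmf(values):
    """pmf of Unif(values): equal probability on each listed value."""
    return {v: Fraction(1, len(values)) for v in values}

def cdf(pmf, x):
    """F(x) = P(X <= x) for a discrete pmf given as a dict."""
    return sum(prob for v, prob in pmf.items() if v <= x)

def smallest_percentile(pmf, p):
    """Smallest support point x with F(x) >= p."""
    for v in sorted(pmf):
        if cdf(pmf, v) >= p:
            return v
    raise ValueError("p must not exceed 1")

dist2 = uniform_pmf([1, 2])      # Distribution 2
dist3 = uniform_pmf([1, 3, 5])   # Distribution 3
half = Fraction(1, 2)

print(smallest_percentile(dist2, half))  # left endpoint of the set of medians
print(smallest_percentile(dist3, half))  # the jump at which F crosses 1/2
print(cdf(dist2, Fraction(3, 2)), cdf(dist2, 2))  # F(1.5) and F(2)
```

For Distribution 2 the CDF sits exactly at height 1/2 from $x = 1$ onward until it jumps to 1 at $x = 2$, so the function returns 1. For Distribution 3 the CDF jumps from 1/3 to 2/3 at $x = 3$, so the function returns 3.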