ways:

$$E[Y] = \int_{-\infty}^{\infty} y \, dF_Y(y) = \int_{-\infty}^{\infty} g(x) \, dF_X(x). \tag{1.25}$$
The integral in this equation is called a Stieltjes integral,$^{15}$ which for our purposes is shorthand for $\int g(x) f(x)\,dx$ if a density $f$ exists and $\sum_x g(x) p(x)$ if a PMF $p$ exists. The integral is always linear in the sense that if $g(x) = a(x) + b(x)$, then $\int g(x)\,dF(x) = \int a(x)\,dF(x) + \int b(x)\,dF(x)$. Usually it can be calculated using some variation on (1.24).

$^{15}$ To be a little more careful, we can view $\int_{-\infty}^{\infty} g(x)\,dF(x)$ as $\lim_{\delta \to 0} \sum_n g(n\delta)\,[F(n\delta) - F(n\delta - \delta)]$.

20 CHAPTER 1. INTRODUCTION AND REVIEW OF PROBABILITY

Particularly important examples of such expected values are the moments $E[X^n]$ of a rv $X$ and the central moments $E[(X - \bar{X})^n]$ of $X$, where $\bar{X}$ is the mean $E[X]$. The second central moment is called the variance, denoted by $\mathrm{VAR}(X)$ or $\sigma_X^2$. It is given by
$$\mathrm{VAR}(X) = E\left[(X - \bar{X})^2\right] = E\left[X^2\right] - \bar{X}^2. \tag{1.26}$$

The standard deviation of $X$, $\sigma_X$, is the square root of the variance and provides a measure
of dispersion of the rv around the mean. Thus the mean is a rough measure of typical
values for the outcome of the rv, and $\sigma_X$ is a measure of the typical difference between $X$
and $\bar{X}$. There are other measures of typical value (such as the median and the mode) and
other measures of dispersion, but mean and standard deviation have a number of special
properties that make them important. One of these (see Exercise 1.20) is that $E[X]$ is the
value of $x$ that minimizes $E[(X - x)^2]$.
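The variance identity in (1.26) and the minimizing property of the mean can be checked numerically. The following is a minimal sketch, assuming a small hypothetical PMF whose values and probabilities are made up for illustration; the helper names `expect` and `mse` are likewise not from the text.

```python
from fractions import Fraction

# Hypothetical PMF (an assumption for illustration): value of X -> probability.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 4)}

def expect(g, pmf):
    """E[g(X)] computed as the sum over x of g(x) p(x), the PMF form of (1.25)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x, pmf)                    # E[X]
second_moment = expect(lambda x: x * x, pmf)       # E[X^2]

# Two routes to the variance, as in (1.26).
var_central = expect(lambda x: (x - mean) ** 2, pmf)
var_shortcut = second_moment - mean ** 2

def mse(c):
    """Mean squared error E[(X - c)^2] for a candidate value c."""
    return expect(lambda x: (x - c) ** 2, pmf)

# Search a grid of candidates (multiples of 1/8 from 0 to 4) for the minimizer.
candidates = [Fraction(k, 8) for k in range(0, 33)]
best = min(candidates, key=mse)

print("mean:", mean)
print("variance (central form):", var_central)
print("variance (shortcut form):", var_shortcut)
print("grid minimizer of E[(X - c)^2]:", best)
```

Exact fractions are used so that the two variance expressions can be compared without rounding error. The grid search is only a sanity check, not a proof; the general argument is the subject of the exercise cited above.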
If $g$ is a strictly monotonic increasing function, then the distribution function of $Y = g(X)$ can be found by noting that for any $x$, the event $\{X \le x\}$ is the same as the event $\{Y \le g(x)\}$. Thus

$$F_Y(g(x)) = F_X(x) \quad \text{or} \quad F_Y(y) = F_X(g^{-1}(y)). \tag{1.27}$$

If $X$ is a continuous rv and $g$ a continuously differentiable function, then the density of $Y = g(X)$ is related to the density of $X$ by

$$g'(x)\, f_Y(g(x)) = f_X(x) \quad \text{or} \quad f_Y(y) = \frac{f_X(g^{-1}(y))}{g'(g^{-1}(y))}. \tag{1.28}$$

Another way to see this is to observe that the probability that $X$ lies between $x$ and $x + \epsilon$ is, to first order, $\epsilon f_X(x)$. This is the same as the probability that $Y$ lies between $g(x)$ and $g(x + \epsilon) \approx g(x) + \epsilon g'(x)$. Finally, if $g$ is not monotonic then (1.27) must be modified to take account of the possibility that the event $Y \le y$ corresponds to multiple intervals for $X$.

If $X$ and $Y$ are rv's, then the sum $Z = X + Y$ is also a rv. To see this, note that $X(\omega) \in \mathbb{R}$ and $Y(\omega) \in \mathbb{R}$ for every $\omega \in \Omega$ and thus, by the axioms of the real numbers, $Z(\omega)$ is in $\mathbb{R}$ and is thus finite.$^{16}$ In the same way, the sum $S_n = X_1 + \cdots + X_n$ of any finite collection of rv's is also a rv.
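Returning briefly to the transformation formulas (1.27) and (1.28), a worked case may help. The choice below is an assumption for illustration and is not in the text: $X$ is uniform on $(0, 1)$ and $g(x) = x^2$, which is strictly increasing and continuously differentiable on that interval.

$$
\begin{aligned}
F_X(x) &= x, \quad f_X(x) = 1, \qquad 0 < x < 1, \\
g^{-1}(y) &= \sqrt{y}, \quad g'(x) = 2x, \\
F_Y(y) &= F_X(g^{-1}(y)) = \sqrt{y}, \qquad 0 < y < 1, \\
f_Y(y) &= \frac{f_X(g^{-1}(y))}{g'(g^{-1}(y))} = \frac{1}{2\sqrt{y}}, \qquad 0 < y < 1.
\end{aligned}
$$

Differentiating $F_Y(y) = \sqrt{y}$ directly gives the same density $1/(2\sqrt{y})$, so (1.27) and (1.28) agree in this case.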
If $X$ and $Y$ are independent, then the distribution function of $Z$ is given by

$$F_Z(z) = \int_{-\infty}^{\infty} F_X(z - y)\, dF_Y(y) = \int_{-\infty}^{\infty} F_Y(z - x)\, dF_X(x). \tag{1.29}$$

If $X$ and $Y$ both have densities, this can be rewritten as

$$f_Z(z) = \int_{-\infty}^{\infty} f_X(z - y) f_Y(y)\, dy = \int_{-\infty}^{\infty} f_Y(z - x) f_X(x)\, dx. \tag{1.30}$$

$^{16}$ We usually extend the definition of a rv by allowing it to be undefined over a set of probability 0, but the sum is then undefined only over a set of probability 0 also.

1.3. PROBABILITY REVIEW 21

Eq. (1.30) is the familiar convolution equation from linear systems, and we similarly refer to (1.29) as the convolution of distribution functions (although it has a different functional form from (1.30)). If $X$ and $Y$ are nonnegative random variables, then the integrands in (1.29) and (1.30) are nonzero only between 0 and $z$, so we often use 0 and $z$ as the limits in (1.29) and (1.30). Note, however, that if $\Pr\{Y = 0\} \ne 0$, then the lower limit must be $0^-$, i.e., in terms of (1.29), it must include the jump in $F_Y(y)$ at $y = 0$. One can avoid confusion of this type by always keeping infinite limits until actually calculating something.
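To illustrate (1.30) with the limits 0 and $z$, here is a worked case under an assumption that is not in the text: $X$ and $Y$ are independent, each with the exponential density $\lambda e^{-\lambda x}$ for $x \ge 0$. Neither rv has a point mass at 0, so the $0^-$ issue does not arise.

$$
\begin{aligned}
f_Z(z) &= \int_0^z f_X(z - y) f_Y(y)\, dy = \int_0^z \lambda e^{-\lambda (z - y)}\, \lambda e^{-\lambda y}\, dy \\
&= \lambda^2 e^{-\lambda z} \int_0^z dy = \lambda^2 z\, e^{-\lambda z}, \qquad z \ge 0.
\end{aligned}
$$

As a check, $\int_0^{\infty} \lambda^2 z\, e^{-\lambda z}\, dz = 1$, so the result is a valid density.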
If $X_1, X_2, \ldots, X_n$ are independent rv's, then the distribution of the rv $S_n = X_1 + X_2 + \cdots + X_n$ can be found by first convolving the distributions of $X_1$ and $X_2$ to get the distribution of $S_2$ and then, for each $i \ge 2$, convolving the distribution of $S_i$ and $X_{i+1}$ to get the distribution of $S_{i+1}$. The distributions can be convolved in any order to get the same resulting distribution.
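The repeated-convolution procedure can be sketched for the discrete case, where the convolution of PMFs plays the role of (1.29) and (1.30). This is a minimal illustration, assuming independent integer-valued rv's represented as dictionaries; the function name `convolve` and the die and coin PMFs are hypothetical choices, not from the text.

```python
from fractions import Fraction
from functools import reduce

def convolve(p, q):
    """PMF of X + Y for independent X ~ p and Y ~ q (dicts: value -> probability)."""
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            # Each pair (x, y) contributes Pr{X = x} Pr{Y = y} to the mass at x + y.
            out[x + y] = out.get(x + y, 0) + px * qy
    return out

# Hypothetical example PMFs: a fair six-sided die and a fair 0/1 coin.
die = {k: Fraction(1, 6) for k in range(1, 7)}
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# S_3 = X_1 + X_2 + X_3, built by convolving one distribution at a time.
s3 = reduce(convolve, [die, die, coin])

# The same three distributions convolved in a different order.
s3_other = reduce(convolve, [coin, die, die])

mean_s3 = sum(v * p for v, p in s3.items())

print("orders agree:", s3 == s3_other)
print("total probability:", sum(s3.values()))
print("E[S_3]:", mean_s3)
```

The nested loop costs time proportional to the product of the two support sizes, which is fine for small examples like this one.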
Whether or not $X_1, X_2, \ldots, X_n$ are independent, the expected value of $S_n = X_1 + X_2 + \cdots + X_n$ satisfies

$$E[S_n] = E[X_1 + X_2 + \cdots + X_n] = E[X_1] + E[X_2] + \cdots + E[X_n]. \tag{1.31}$$

This says that the expected value of a sum is equal to the sum of the expected values, whether or not the rv's are independent (see Exercise 1.11). The following example shows how this can be a valuable problem-solving aid with an appropriate choice of rv's.
Example 1.3.4. In packet networks, a packet can be crudely modeled as a string of IID binary digits with $\Pr\{0\} = \Pr\{1\} = 1/2$. Packets are often separated from each other by a special bit pattern, 01111110, called a flag. If this special pattern appears within a packet, it could be interpreted as...
 Spring '09
 R.Srikant
