If X takes on only a finite or countable number of possible sample values, the probability of each sample value xi is called the probability mass function (PMF) at xi and denoted by pX(xi); such a random variable is called discrete. The distribution function of a discrete rv is a 'staircase function,' staying constant between the possible sample values and having a jump of magnitude pX(xi) at each sample value xi.
A rv can also be neither continuous nor discrete. If FX(x) has a discontinuity at some x0, it means that there is a discrete probability at x0 equal to the magnitude of the discontinuity. In this case FX(x0) is given by the height of the upper point at the discontinuity.
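The staircase behavior of a discrete distribution function can be sketched as follows (the PMF values here are hypothetical, chosen only for illustration):

```python
# Sketch of a discrete rv's distribution function (hypothetical PMF).
# F_X(x) = sum of p_X(x_i) over sample values x_i <= x: a staircase
# that jumps by p_X(x_i) at each x_i and is constant in between.

pmf = {0: 0.25, 1: 0.5, 2: 0.25}  # hypothetical p_X

def F(x):
    """Distribution function F_X(x) = Pr{X <= x}."""
    return sum(p for xi, p in pmf.items() if xi <= x)

# Constant between sample values, jump of p_X(x_i) at each x_i;
# at a jump, F_X takes the height of the upper point:
assert F(-1) == 0.0
assert F(0) == 0.25       # includes the jump at x = 0
assert F(0.5) == 0.25     # flat between 0 and 1
assert F(1) == 0.75
assert F(2.5) == 1.0
```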
If the distribution function FX (x) of a rv X has a (ﬁnite) derivative at x, the derivative
is called the probability density (or just density) of X at x and denoted by fX (x); for
suﬃciently small δ , δ fX (x) then approximates the probability that X is mapped into a
value between x and x + δ . If the density exists for all x, the rv is said to be continuous.
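The approximation Pr{x ≤ X ≤ x + δ} ≈ δ fX(x) can be checked numerically; as a sketch, take the exponential density fX(x) = e^{-x}, FX(x) = 1 − e^{-x} for x ≥ 0 (an arbitrary choice of continuous rv, not one singled out by the text):

```python
import math

# Sketch: for small delta, delta * f_X(x) approximates
# Pr{x <= X <= x + delta}.  Example: exponential density
# f_X(x) = e^{-x} with distribution F_X(x) = 1 - e^{-x}.

def F(x):       # distribution function
    return 1 - math.exp(-x)

def f(x):       # density = derivative of F
    return math.exp(-x)

x, delta = 1.0, 1e-4
exact = F(x + delta) - F(x)          # Pr{x <= X <= x + delta}
approx = delta * f(x)
assert abs(exact - approx) / exact < 1e-3
```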
Elementary probability courses work primarily with the PMF and the density, since they are
convenient for computational exercises. We will often work with the distribution function
here. This is partly because it is always deﬁned, partly to avoid saying everything thrice,
for discrete, continuous, and other rv’s, and partly because the distribution function is
most important in limiting arguments such as steady-state time-average arguments. For
distribution functions, density functions, and PMF’s, the subscript denoting the rv is usually
omitted if the rv is clear from the context. The same convention is used for complex, vector,
etc. rv's.

1.3.4 Multiple random variables and conditional probabilities

Often we must deal with multiple random variables (rv's) in a single probability experiment.
If X1 , X2 , . . . , Xn are n rv’s or the components of a vector rv, their joint distribution function
is deﬁned by
FX1,...,Xn (x1, x2, . . . , xn) = Pr {ω ∈ Ω : X1(ω) ≤ x1, X2(ω) ≤ x2, . . . , Xn(ω) ≤ xn} . (1.15)
This deﬁnition goes a long way toward explaining why we need the notion of a sample space
Ω when all we want to talk about is a set of rv's. The distribution function of a rv fully
describes the individual behavior of that rv, but Ω and the mappings are needed to describe
how the rv’s interact.
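This point can be sketched with a hypothetical experiment: two equiprobable bits, where two choices of the mapping X2 give the same marginal distributions but very different joint behavior (the experiment and mappings here are invented for illustration):

```python
# Sketch: Omega = {00, 01, 10, 11}, equiprobable.  X1 and X2 have the
# same marginal distribution in both cases below, but Omega and the
# mappings determine how the rv's interact.

omega = ['00', '01', '10', '11']          # equiprobable sample points

X1 = {w: int(w[0]) for w in omega}        # first bit
X2_indep = {w: int(w[1]) for w in omega}  # second bit: independent of X1
X2_copy = {w: int(w[0]) for w in omega}   # same mapping as X1: dependent

def joint_F(X, Y, x, y):
    """F_XY(x, y) = Pr{w in Omega : X(w) <= x, Y(w) <= y}."""
    return sum(1 for w in omega if X[w] <= x and Y[w] <= y) / len(omega)

# Identical marginals (set the other argument to a huge value) ...
assert joint_F(X1, X2_indep, 0, 10) == joint_F(X1, X2_copy, 0, 10) == 0.5
# ... but different joint distribution functions:
assert joint_F(X1, X2_indep, 0, 0) == 0.25
assert joint_F(X1, X2_copy, 0, 0) == 0.5
```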
For a vector rv X with components X1 , . . . , Xn , or a complex rv X with real and imaginary
parts X1 , X2 , the distribution function is also deﬁned by (1.15). Note that {X1 ≤ x1 , X2 ≤
x2 , . . . , Xn ≤ xn } is an event and the corresponding probability is nondecreasing in each
argument xi. Also the distribution function of any subset of the random variables is obtained by setting the other arguments to +∞. For example, the distribution of a single rv (called
a marginal distribution) is given by
FXi (xi) = FX1 ,...,Xi−1 ,Xi ,Xi+1 ,...,Xn (∞, . . . , ∞, xi, ∞, . . . , ∞).
If the rv's are all discrete, the joint PMF is given by Pr {X1 = x1, . . . , Xn = xn}. Similarly, if a joint probability density f (x1, . . . , xn) exists, it is given by the derivative ∂ⁿF(x1, . . . , xn)/∂x1 ∂x2 · · · ∂xn.
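For discrete rv's, the marginal PMF is obtained by summing the joint PMF over the other variables, in parallel with setting the other arguments of the distribution function to +∞. A sketch with a hypothetical two-variable joint PMF:

```python
# Sketch (hypothetical joint PMF p_{X1,X2}): the marginal PMF of one
# coordinate is obtained by summing the joint PMF over the others.

joint_pmf = {
    (0, 0): 0.1, (0, 1): 0.3,
    (1, 0): 0.2, (1, 1): 0.4,
}

def marginal(pmf, index):
    """Sum out all coordinates except `index`."""
    out = {}
    for xs, p in pmf.items():
        out[xs[index]] = out.get(xs[index], 0.0) + p
    return out

p_X1 = marginal(joint_pmf, 0)
assert abs(p_X1[0] - 0.4) < 1e-12
assert abs(p_X1[1] - 0.6) < 1e-12
```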
Two rv's, say X and Y , are statistically independent (or, more briefly, independent) if
FXY (x, y) = FX (x)FY (y) for each x ∈ R, y ∈ R. (1.16)

If X and Y are discrete rv's then the definition of independence in (1.16) is seen to be
equivalent to
pXY (xi, yj) = pX (xi)pY (yj) for each value xi of X and yj of Y .

Since X = xi and Y = yj are events, the conditional probability of X = xi conditional on
Y = yj (assuming pY (yj ) > 0) is given by (1.10) to be
pXY (xi | yj) = pXY (xi, yj) / pY (yj).

If pXY (xi | yj) = pX (xi) for all i, j, then it is seen that X and Y are independent. This
captures the intuitive notion of independence better than (1.16) for discrete rv's, since it
can be viewed as saying that the PMF of X is not aﬀected by the sample value of Y .
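As a sketch, with a hypothetical joint PMF chosen so that it factors, the conditional-PMF criterion and the factorization (1.16) for PMFs agree:

```python
# Sketch: for discrete rv's with a hypothetical joint PMF, independence
# via p_XY(x_i | y_j) = p_X(x_i) agrees with the factorization
# p_XY(x_i, y_j) = p_X(x_i) p_Y(y_j).

joint = {(0, 0): 0.12, (0, 1): 0.28,   # chosen so that p_XY factors:
         (1, 0): 0.18, (1, 1): 0.42}   # p_X = (0.4, 0.6), p_Y = (0.3, 0.7)

p_X = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}
p_Y = {y: sum(p for (_, yj), p in joint.items() if yj == y) for y in (0, 1)}

for (x, y), p in joint.items():
    cond = p / p_Y[y]                       # p_XY(x | y)
    assert abs(cond - p_X[x]) < 1e-12       # equals the marginal p_X(x)
    assert abs(p - p_X[x] * p_Y[y]) < 1e-12  # factorization, as in (1.16)
```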
If X and Y have a joint density, then (1.16) is equivalent to
fXY (x, y) = fX (x)fY (y) for each x ∈ R, y ∈ R.

If fY (y) > 0, the conditional density can be defined as fXY (x | y) = fXY (x, y)/fY (y). Then statistical independence can be expressed as

fXY (x | y) = fX (x) where fY (y) > 0. (1.17)
This captures the intuitive notion of statistical independence for continuous rv's better than (1.16), but it does not quite say that the density of X, conditional on Y = y, is the same as the marginal density of X. The event Y = y has zero probability, and we cannot condition on events of zero probability. In terms of the derivatives defining these densities, the conditional density looks at the probability that x ≤ X ≤ x + δ given that y ≤ Y ≤ y + ε, in the limit δ, ε → 0. At some level this is a very technical point, and the intuition of conditioning on Y = y works very well. In fact, problems are often directly modeled in terms of conditional probabilities, so that viewing a conditional density as a derivative is less relevant.
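The limiting argument can be sketched numerically. Taking the hypothetical joint density fXY (x, y) = x + y on the unit square (so fY (y) = y + 1/2 and the rv's are not independent), the ratio Pr{x ≤ X ≤ x + δ | y ≤ Y ≤ y + ε}/δ approaches the conditional density as δ, ε → 0:

```python
# Sketch of the limiting argument for a hypothetical joint density
# f_XY(x, y) = x + y on the unit square: compare
# Pr{x <= X <= x+d | y <= Y <= y+e} / d  with  f_XY(x | y) = f_XY(x,y)/f_Y(y)
# for small d, e.  Here f_Y(y) = y + 1/2.

def pr_box(x, y, d, e):
    """Exact Pr{x<=X<=x+d, y<=Y<=y+e}: integral of u+v over the box."""
    return d * e * (x + d / 2 + y + e / 2)

def pr_y(y, e):
    """Exact Pr{y<=Y<=y+e}: integral of f_Y(v) = v + 1/2."""
    return e * (y + e / 2 + 0.5)

x, y = 0.3, 0.6
cond_density = (x + y) / (y + 0.5)          # f_XY(x | y)

d = e = 1e-5
ratio = pr_box(x, y, d, e) / pr_y(y, e) / d
assert abs(ratio - cond_density) < 1e-4
```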
This note was uploaded on 09/27/2010 for the course EE 229 taught by Professor R.srikant during the Spring '09 term at University of Illinois, Urbana Champaign.
 Spring '09
 R.Srikant

Click to edit the document details