Discrete-time stochastic processes

…neither continuous nor discrete. If F_X(x) has a discontinuity at some x_0, it means that there is a discrete probability at x_0 equal to the magnitude of the discontinuity. In this case F_X(x_0) is given by the height of the upper point at the discontinuity.

The probability of each sample value x_i is called the probability mass function (PMF) at x_i and denoted by p_X(x_i); such a random variable is called discrete. The distribution function of a discrete rv is a 'staircase function', staying constant between the possible sample values and having a jump of magnitude p_X(x_i) at each sample value x_i.

If the distribution function F_X(x) of a rv X has a (finite) derivative at x, the derivative is called the probability density (or just density) of X at x and denoted by f_X(x); for sufficiently small δ, δ f_X(x) then approximates the probability that X is mapped into a value between x and x + δ. If the density exists for all x, the rv is said to be continuous.

Elementary probability courses work primarily with the PMF and the density, since they are convenient for computational exercises. We will often work with the distribution function here. This is partly because it is always defined, partly to avoid saying everything thrice (for discrete, continuous, and other rv's), and partly because the distribution function is most important in limiting arguments such as steady-state time-average arguments. For distribution functions, density functions, and PMF's, the subscript denoting the rv is usually omitted if the rv is clear from the context. The same convention is used for complex, vector, etc. rv's.

## 1.3.4 Multiple random variables and conditional probabilities

Often we must deal with multiple random variables (rv's) in a single probability experiment. If X_1, X_2, ..., X_n are n rv's or the components of a vector rv, their joint distribution function is defined by

    F_{X_1,...,X_n}(x_1, x_2, ..., x_n) = Pr{ω ∈ Ω | X_1(ω) ≤ x_1, X_2(ω) ≤ x_2, ..., X_n(ω) ≤ x_n}.    (1.15)
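The two cases above can be checked numerically. The following is a minimal sketch, not from the text: the discrete sample values and probabilities are hypothetical, and the standard exponential is used as an example of a continuous rv. It shows the staircase form of a discrete distribution function and the approximation Pr{x < X ≤ x + δ} ≈ δ f_X(x).

```python
import math

# Hypothetical discrete rv: PMF p_X(x_i) on sample values {1, 2, 4}.
pmf = {1: 0.5, 2: 0.3, 4: 0.2}

def F_discrete(x):
    """Staircase distribution function: sum the PMF over sample values x_i <= x."""
    return sum(p for xi, p in pmf.items() if xi <= x)

# Constant between sample values; jump of magnitude p_X(x_i) at each x_i.
assert F_discrete(1.5) == F_discrete(1.9)
assert abs(F_discrete(2.5) - F_discrete(1.5) - pmf[2]) < 1e-12

# Continuous rv: exponential density f_X(x) = e^{-x}, so F_X(x) = 1 - e^{-x}.
f = lambda x: math.exp(-x)
F = lambda x: 1.0 - math.exp(-x)

x, delta = 1.0, 1e-4
# delta * f_X(x) approximates Pr{x < X <= x + delta} to within O(delta^2).
assert abs((F(x + delta) - F(x)) - delta * f(x)) < delta ** 2
```

The error in the last approximation shrinks quadratically in δ, which is what "the derivative of F_X" means operationally.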
This definition goes a long way toward explaining why we need the notion of a sample space Ω when all we want to talk about is a set of rv's. The distribution function of a rv fully describes the individual behavior of that rv, but Ω and the mappings are needed to describe how the rv's interact.

For a vector rv X with components X_1, ..., X_n, or a complex rv X with real and imaginary parts X_1, X_2, the distribution function is also defined by (1.15). Note that {X_1 ≤ x_1, X_2 ≤ x_2, ..., X_n ≤ x_n} is an event and the corresponding probability is nondecreasing in each argument x_i. Also, the distribution function of any subset of the random variables is obtained by setting the other arguments to +∞. For example, the distribution of a single rv (called a marginal distribution) is given by

    F_{X_i}(x_i) = F_{X_1,...,X_{i-1},X_i,X_{i+1},...,X_n}(∞, ..., ∞, x_i, ∞, ..., ∞).

If the rv's are all discrete, the joint PMF is given by Pr{X_1 = x_1, ..., X_n = x_n}. Similarly, if a joint probability density f(x_1, ..., x_n) exists, it is given by the derivative

    f(x_1, ..., x_n) = ∂^n F(x_1, ..., x_n) / (∂x_1 ∂x_2 ··· ∂x_n).

Two rv's, say X and Y, are statistically independent (or, more briefly, independent) if

    F_{XY}(x, y) = F_X(x) F_Y(y)    for each x ∈ R, y ∈ R.    (1.16)

If X and Y are discrete rv's, then the definition of independence in (1.16) is seen to be equivalent to

    p_{XY}(x_i, y_j) = p_X(x_i) p_Y(y_j)    for each value x_i of X and y_j of Y.

Since X = x_i and Y = y_j are events, the conditional probability of X = x_i conditional on Y = y_j (assuming p_Y(y_j) > 0) is given by (1.10) to be

    p_{X|Y}(x_i | y_j) = p_{XY}(x_i, y_j) / p_Y(y_j).

If p_{X|Y}(x_i | y_j) = p_X(x_i) for all i, j, then it is seen that X and Y are independent. This captures the intuitive notion of independence better than (1.16) for discrete rv's, since it can be viewed as saying that the PMF of X is not affected by the sample value of Y.
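The discrete definitions above (marginals from a joint PMF, the conditional PMF, and the factorization test for independence) can be sketched in a few lines. All numbers here are hypothetical, chosen so that X and Y are independent by construction:

```python
# Hypothetical marginal PMFs; the joint PMF is built as the product,
# so X and Y are independent by construction.
p_X = {0: 0.25, 1: 0.75}
p_Y = {0: 0.4, 1: 0.6}
p_XY = {(x, y): p_X[x] * p_Y[y] for x in p_X for y in p_Y}

def marginal_X(x):
    """p_X(x_i) recovered by summing the joint PMF over all y_j."""
    return sum(p for (xi, yj), p in p_XY.items() if xi == x)

def cond_X_given_Y(x, y):
    """p_{X|Y}(x_i | y_j) = p_{XY}(x_i, y_j) / p_Y(y_j), defined for p_Y(y_j) > 0."""
    p_y = sum(p for (xi, yj), p in p_XY.items() if yj == y)
    return p_XY[(x, y)] / p_y

# Independence: conditioning on the sample value of Y leaves the PMF of X unchanged.
for x in p_X:
    for y in p_Y:
        assert abs(cond_X_given_Y(x, y) - marginal_X(x)) < 1e-12
```

Replacing the product joint PMF with any non-product table makes the final check fail, which is exactly the p_{X|Y} = p_X criterion in the text.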
If X and Y have a joint density, then (1.16) is equivalent to

    f_{XY}(x, y) = f_X(x) f_Y(y)    for each x ∈ R, y ∈ R.

If f_Y(y) > 0, the conditional density can be defined as

    f_{X|Y}(x | y) = f_{XY}(x, y) / f_Y(y).

Then statistical independence can be expressed as

    f_{X|Y}(x | y) = f_X(x)    where f_Y(y) > 0.    (1.17)

This captures the intuitive notion of statistical independence for continuous rv's better than (1.16), but it does not quite say that the density of X, conditional on Y = y, is the same as the marginal density of X. The event Y = y has zero probability, and we cannot condition on events of zero probability. If we look at the derivatives defining these densities, the conditional density looks at the probability that x ≤ X ≤ x + δ given that y ≤ Y ≤ y + ε, in the limit δ, ε → 0. At some level this is a very technical point, and the intuition of conditioning on Y = y works very well. In fact, problems are often directly modeled in terms of conditional probabilities, so that viewing a conditional density as a derivative is less relevant. We…
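The limiting argument above can be made concrete. The following sketch uses a joint density chosen purely for illustration (not from the text), f_{XY}(x, y) = x + y on the unit square, and checks that Pr{x ≤ X ≤ x + δ | y ≤ Y ≤ y + ε} / δ approaches f_{X|Y}(x | y) as δ, ε → 0. The box and strip probabilities are integrated in closed form, which is easy because the density is linear:

```python
# Illustrative (hypothetical) joint density on [0, 1]^2: f_{XY}(x, y) = x + y.
f_XY = lambda x, y: x + y
f_Y = lambda y: 0.5 + y                   # integral of (x + y) dx over [0, 1]
f_X_given_Y = lambda x, y: f_XY(x, y) / f_Y(y)

def pr_box(x, y, delta, eps):
    """Pr{x <= X <= x+delta, y <= Y <= y+eps}, closed form for this density."""
    return delta * eps * (x + delta / 2 + y + eps / 2)

def pr_strip(y, eps):
    """Pr{y <= Y <= y+eps} = integral of f_Y over [y, y+eps]."""
    return eps * (0.5 + y + eps / 2)

x, y = 0.3, 0.6
delta = eps = 1e-5
# Conditional probability per unit delta approaches the conditional density.
ratio = pr_box(x, y, delta, eps) / pr_strip(y, eps) / delta
assert abs(ratio - f_X_given_Y(x, y)) < 1e-4

# Sanity check: f_{X|Y}(. | y) integrates to 1 in x (trapezoid is exact here,
# since the integrand is linear in x).
integral = 0.5 * (f_X_given_Y(0.0, y) + f_X_given_Y(1.0, y))
assert abs(integral - 1.0) < 1e-12
```

Note that pr_strip(y, eps) is positive even though Pr{Y = y} = 0, which is why the δ, ε → 0 limit sidesteps conditioning on a zero-probability event.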

This note was uploaded on 09/27/2010 for the course EE 229, taught by Professor R. Srikant during the Spring '09 term at the University of Illinois, Urbana-Champaign.
