Discrete-time stochastic processes

.16 illustrates the advantages of this approach, particularly where it is initially unclear whether or not the expectation is finite. The following example shows that this approach can sometimes hide convergence questions and give the wrong answer.

Example 1.3.5. Let Y be a geometric rv with the PMF p_Y(y) = 2^{−y} for integer y ≥ 1. Let X be an integer rv that, conditional on Y, is binary with equiprobable values ±2^y given Y = y. We then see that E[X | Y = y] = 0 for all y, and thus (1.37) indicates that E[X] = 0. On the other hand, it is easy to see that p_X(2^k) = 2^{−k−1} and p_X(−2^k) = 2^{−k−1} for each integer k ≥ 1. Thus the expectation over the positive values of X is ∞ and that over the negative values is −∞. In other words, the expected value of X is undefined and (1.37) is incorrect.

The difficulty in the above example cannot occur if X is a non-negative rv. Then (1.37) is simply a sum of a countable number of non-negative terms, and thus it either converges to a finite sum independent of the order of summation, or it diverges to ∞, again independent of the order of summation. If X has both positive and negative components, we can separate it into X = X⁺ + X⁻ where X⁺ = max(0, X) and X⁻ = min(X, 0). Then (1.37) applies to X⁺ and to −X⁻ separately. If at most one of E[X⁺] and E[−X⁻] is infinite, then (1.37) applies to X, and otherwise E[X] is undefined. This is summarized in the following theorem:

Theorem 1.1 (Total expectation). Let X and Y be discrete rv's. If X is non-negative, then E[X] = E[E[X | Y]] = Σ_y p_Y(y) E[X | Y = y]. If X has both positive and negative values, and if at most one of E[X⁺] and E[−X⁻] is infinite, then E[X] = E[E[X | Y]] = Σ_y p_Y(y) E[X | Y = y].
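The failure in Example 1.3.5 can be seen numerically. The following sketch (Python, illustrative only; the sampling functions are my own construction, not from the text) draws Y with p_Y(y) = 2^{−y} and then X = ±2^Y with equal probability. Each conditional distribution is symmetric about 0, yet the running sample mean of X never settles down, because rare, enormous values ±2^y dominate the sum — exactly the undefined-expectation behavior the example describes.

```python
import random

random.seed(0)

def sample_x():
    # Y is geometric: p_Y(y) = 2^{-y} for integer y >= 1
    y = 1
    while random.random() < 0.5:
        y += 1
    # Given Y = y, X = +2^y or -2^y with equal probability
    return (2 ** y) if random.random() < 0.5 else -(2 ** y)

# E[X | Y = y] = 0 for every y, yet the sample mean of X does not
# converge as the sample size grows:
for n in (10_000, 100_000, 1_000_000):
    mean = sum(sample_x() for _ in range(n)) / n
    print(n, mean)
```

Every value X takes is ±2^k for some k ≥ 1, and the heaviest observed term in a sample of size n is typically of order n itself, which is why the averages above jump around rather than converging.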
We have seen above that if Y is a discrete rv, then the conditional expectation E[X | Y = y] is little more complicated than the unconditional expectation, and this is true whether X is discrete, continuous, or arbitrary. If X and Y are continuous, we can essentially extend these results to probability densities. In particular,

E[X | Y = y] = ∫_{−∞}^{∞} x f_{X|Y}(x | y) dx,    (1.36)

and

E[X] = ∫_{−∞}^{∞} f_Y(y) E[X | Y = y] dy = ∫_{−∞}^{∞} f_Y(y) ∫_{−∞}^{∞} x f_{X|Y}(x | y) dx dy.    (1.37)

We do not state this as a theorem because the details about the integration do not seem necessary for the places where it is useful.

1.3.9 Indicator random variables

For any event A, the indicator random variable of A, denoted I_A, is a binary rv that has the value 1 for all ω ∈ A and the value 0 otherwise. Thus, as illustrated in Figure 1.6, the distribution function F_{I_A}(x) is 0 for x < 0, 1 − Pr{A} for 0 ≤ x < 1, and 1 for x ≥ 1. It is obvious, by comparing Figures 1.6 and 1.3, that E[I_A] = Pr{A}.

[Figure 1.6: The distribution function of an indicator random variable — a step function with value 0 for x < 0, 1 − Pr{A} for 0 ≤ x < 1, and 1 for x ≥ 1.]

Indicator rv's are useful because they allow us to apply the many known results about rv's and expectations to events. For example, the laws of large numbers are expressed in terms of sums of rv's, and those results all translate into results about relative frequencies through the use of indicator functions.

1.3.10 Transforms

The moment generating function (mgf) for a rv X is given by

g_X(r) = E[e^{rX}] = ∫_{−∞}^{∞} e^{rx} dF_X(x).    (1.38)

Viewing r as a real variable, we see that for r > 0, g_X(r) only exists if 1 − F_X(x) approaches 0 at least exponentially as x → ∞. Similarly, for r < 0, g_X(r) exists only if F_X(x) approaches 0 at least exponentially as x → −∞. If g_X(r) exists in a region of r around 0, then derivatives of all orders exist in that region, given by

∂^n g_X(r) / ∂r^n = ∫_{−∞}^{∞} x^n e^{rx} dF_X(x);    ∂^n g_X(r) / ∂r^n |_{r=0} = E[X^n].    (1.39)

This shows that finding the moment generating function often provides a convenient way to calculate the moments of a random variable.

Another convenient feature of moment generating functions is their use in dealing with sums of independent rv's. For example, suppose S = X_1 + X_2 + ··· + X_n. Then

g_S(r) = E[e^{rS}] = E[exp(Σ_{i=1}^{n} r X_i)] = E[∏_{i=1}^{n} exp(r X_i)] = ∏_{i=1}^{n} g_{X_i}(r).    (1.40)

In the last step here, we have used a result of Exercise 1.11, which shows that for independent rv's, the mean of the product is equal to the product of the means. If X_1, …, X_n are also IID, then

g_S(r) = [g_X(r)]^n.    (1.41)

The variable r in the mgf can also be viewed as a complex variable, giving rise to a number of other transforms. A particularly important case is to view r as a pure imaginary variable, say iω where i = √−1 and ω is real. The mgf is then called the characteristic function. Since |e^{iωx}| is 1 for all x, g_X(iω) must always exist, and its magnitude is at most one. Note that g_X(iω) is the inverse Fourier transform of the density of X. The Z-transform is the result of replacing e^r with z in g_X(r). This is useful pri...
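Equations (1.39) and (1.41) can be checked concretely. The sketch below is a minimal illustration under my own choice of distribution (Bernoulli, not taken from the text): for X ~ Bernoulli(p), the mgf is g_X(r) = 1 − p + p e^r; the mgf of a sum of n IID copies, computed directly from the Binomial(n, p) PMF, should equal [g_X(r)]^n, and a numerical derivative of g_X at r = 0 should recover the first moment E[X] = p.

```python
import math

p, n = 0.3, 10

def g_X(r):
    # mgf of a Bernoulli(p) rv: E[e^{rX}] = (1 - p)*e^{0} + p*e^{r}
    return 1 - p + p * math.exp(r)

def g_S(r):
    # mgf of S = X_1 + ... + X_n, computed directly from the
    # Binomial(n, p) PMF of S rather than via (1.41)
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) * math.exp(r * k)
               for k in range(n + 1))

r = 0.7
# equation (1.41): the two computations agree to numerical precision
print(abs(g_S(r) - g_X(r) ** n))

# equation (1.39) at n = 1: a central-difference derivative of the
# mgf at r = 0 approximates the first moment E[X] = p
h = 1e-6
first_moment = (g_X(h) - g_X(-h)) / (2 * h)
print(first_moment)  # close to p = 0.3
```

The same finite-difference idea extends to higher moments, though in practice repeated numerical differentiation is ill-conditioned and symbolic differentiation of g_X is preferred.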

This note was uploaded on 09/27/2010 for the course EE 229 taught by Professor R. Srikant during the Spring '09 term at the University of Illinois, Urbana-Champaign.
