Introduction to Stochastic Processes II: Random Variables

2 Random Variables

2.1. Random Variables

It frequently occurs that in performing an experiment we are mainly interested in some function of the outcome as opposed to the outcome itself. For instance, in tossing dice we are often interested in the sum of the two dice and are not really concerned about the actual outcome. That is, we may be interested in knowing that the sum is seven and not be concerned over whether the actual outcome was (1,6) or (2,5) or (3,4) or (4,3) or (5,2) or (6,1). These quantities of interest, or more formally, these real-valued functions defined on the sample space, are known as random variables. Since the value of a random variable is determined by the outcome of the experiment, we may assign probabilities to the possible values of the random variable.

Example 2.1 Let X denote the random variable defined as the sum of two fair dice. Then

P{X = 2} = P{(1,1)} = 1/36,
P{X = 3} = P{(1,2), (2,1)} = 2/36,
P{X = 4} = P{(1,3), (2,2), (3,1)} = 3/36,
P{X = 5} = P{(1,4), (2,3), (3,2), (4,1)} = 4/36,
P{X = 6} = P{(1,5), (2,4), (3,3), (4,2), (5,1)} = 5/36,
P{X = 7} = P{(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)} = 6/36,
P{X = 8} = P{(2,6), (3,5), (4,4), (5,3), (6,2)} = 5/36,
P{X = 9} = P{(3,6), (4,5), (5,4), (6,3)} = 4/36,
P{X = 10} = P{(4,6), (5,5), (6,4)} = 3/36,
P{X = 11} = P{(5,6), (6,5)} = 2/36,
P{X = 12} = P{(6,6)} = 1/36.                                        (2.1)

In other words, the random variable X can take on any integral value between two and twelve, and the probability that it takes on each value is given by Equation (2.1). Since X must take on one of the values two through twelve, we must have

1 = P{ U_{n=2}^{12} {X = n} } = Σ_{n=2}^{12} P{X = n},

which may be checked from Equation (2.1). ♦

Example 2.2 Suppose that our experiment consists of tossing two fair coins.
Letting Y denote the number of heads appearing, Y is a random variable taking on one of the values 0, 1, 2 with respective probabilities

P{Y = 0} = P{(T,T)} = 1/4,  P{Y = 1} = P{(T,H), (H,T)} = 1/2,  P{Y = 2} = P{(H,H)} = 1/4.

Of course, P{Y = 0} + P{Y = 1} + P{Y = 2} = 1. ♦

Dept. of Computer Engineering & Science – Dr. I-Shyan Hwang

Example 2.3 Suppose that we toss a coin having probability p of coming up heads, until the first head appears. Letting N denote the number of flips required, and assuming that the outcomes of successive flips are independent, N is a random variable taking on one of the values 1, 2, 3, …, with respective probabilities

P{N = 1} = P{H} = p,
P{N = 2} = P{(T,H)} = (1 − p)p,
P{N = 3} = P{(T,T,H)} = (1 − p)^2 p,
⋮
P{N = n} = P{(T,T,…,T,H)} = (1 − p)^{n−1} p,  n ≥ 1.

As a check, note that

P{ U_{n=1}^∞ {N = n} } = Σ_{n=1}^∞ P{N = n} = p Σ_{n=1}^∞ (1 − p)^{n−1} = p / (1 − (1 − p)) = 1. ♦

Example 2.4 Suppose that our experiment consists of seeing how long a battery can operate before wearing down. Suppose also that we are not primarily interested in the actual lifetime of the battery but are only concerned about whether or not the battery lasts at least two years. In this case, we may define the random variable I by

I = 1, if the lifetime of the battery is two or more years,
    0, otherwise.

If E denotes the event that the battery lasts two or more years, then the random variable I is known as the indicator random variable for the event E. (Note that I equals 1 or 0 depending on whether or not E occurs.) ♦

Example 2.5 Suppose that independent trials, each of which results in any of m possible outcomes with respective probabilities p_1, …, p_m, Σ_{i=1}^m p_i = 1, are continually performed. Let X denote the number of trials needed until each outcome has occurred at least once.
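Before deriving the distribution of X in Example 2.5 analytically, its behavior can be previewed by simulation. The following is a minimal sketch, not from the text: the function name `trials_until_all_seen`, the fixed seed, and the choice of a fair die for the m = 6 outcomes are all ours.

```python
import random

def trials_until_all_seen(probs, rng):
    """Simulate Example 2.5: perform independent trials with outcome
    probabilities `probs` until every outcome has appeared at least
    once, and return the number of trials X that this took."""
    seen = set()
    trials = 0
    while len(seen) < len(probs):
        trials += 1
        seen.add(rng.choices(range(len(probs)), weights=probs)[0])
    return trials

rng = random.Random(0)   # fixed seed so the run is reproducible
probs = [1 / 6] * 6      # e.g., roll a fair die until all six faces appear
samples = [trials_until_all_seen(probs, rng) for _ in range(2000)]
# The classical coupon-collector mean for a fair die is
# 6 * (1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6) = 14.7.
print(sum(samples) / len(samples))
```

The sample mean should land near 14.7, and no sample can be smaller than m = 6, since each outcome needs at least one trial.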
Rather than directly considering P{X = n} we will first determine P{X > n}, the probability that at least one of the outcomes has not yet occurred after n trials. Letting A_i denote the event that outcome i has not yet occurred after the first n trials, i = 1, …, m, then

P{X > n} = P( U_{i=1}^m A_i )
         = Σ_{i=1}^m P(A_i) − Σ_{i<j} P(A_i A_j) + Σ_{i<j<k} P(A_i A_j A_k) − ⋯ + (−1)^{m+1} P(A_1 ⋯ A_m).

Now, P(A_i) is the probability that each of the first n trials results in a non-i outcome, and so by independence

P(A_i) = (1 − p_i)^n.

Similarly, P(A_i A_j) is the probability that the first n trials all result in a non-i and non-j outcome, and so

P(A_i A_j) = (1 − p_i − p_j)^n.

As all of the other probabilities are similar, we see that

P{X > n} = Σ_{i=1}^m (1 − p_i)^n − Σ_{i<j} (1 − p_i − p_j)^n + Σ_{i<j<k} (1 − p_i − p_j − p_k)^n − ⋯.

Since P{X = n} = P{X > n − 1} − P{X > n}, we see, upon using the algebraic identity (1 − a)^{n−1} − (1 − a)^n = a(1 − a)^{n−1}, that

P{X = n} = Σ_{i=1}^m p_i (1 − p_i)^{n−1} − Σ_{i<j} (p_i + p_j)(1 − p_i − p_j)^{n−1} + Σ_{i<j<k} (p_i + p_j + p_k)(1 − p_i − p_j − p_k)^{n−1} − ⋯.

In all of the preceding examples, the random variables of interest took on either a finite or a countable number of possible values. Such random variables are called discrete. However, there also exist random variables that take on a continuum of possible values. These are known as continuous random variables. One example is the random variable denoting the lifetime of a car, when the car's lifetime is assumed to take on any value in some interval (a, b).

The cumulative distribution function (cdf) (or, more simply, the distribution function) F(·) of the random variable X is defined for any real number b, −∞ < b < ∞, by

F(b) = P{X ≤ b}.

In words, F(b) denotes the probability that the random variable X takes on a value less than or equal to b.
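The inclusion-exclusion expression for P{X > n} derived in Example 2.5 is easy to evaluate directly when m is small. A sketch (the function name is ours); the fair-die case lets us cross-check it against the exact count 1 − 6!/6^6 for n = 6:

```python
from itertools import combinations

def prob_not_all_seen(probs, n):
    """P{X > n} from Example 2.5 by inclusion-exclusion over the events
    A_i = "outcome i has not occurred in the first n trials":
    P{X > n} = sum over nonempty index sets S of
               (-1)^(|S|+1) * (1 - sum_{i in S} p_i)^n."""
    m = len(probs)
    total = 0.0
    for r in range(1, m + 1):
        for subset in combinations(range(m), r):
            missing_mass = sum(probs[i] for i in subset)
            total += (-1) ** (r + 1) * (1 - missing_mass) ** n
    return total

probs = [1 / 6] * 6  # a fair die
# With fewer trials than outcomes, some face must still be missing:
print(prob_not_all_seen(probs, 5))   # 1.0, up to rounding
# P{X = n} then follows from P{X = n} = P{X > n-1} - P{X > n}:
print(prob_not_all_seen(probs, 12) - prob_not_all_seen(probs, 13))
```

The subset loop costs 2^m − 1 terms, so this direct evaluation is only practical for small m.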
Some properties of the cdf F are:

(i) F(b) is a nondecreasing function of b,
(ii) lim_{b→∞} F(b) = F(∞) = 1,
(iii) lim_{b→−∞} F(b) = F(−∞) = 0.

Property (i) follows since for a < b the event {X ≤ a} is contained in the event {X ≤ b}, and so cannot have a larger probability. Properties (ii) and (iii) follow since X must take on some finite value.

All probability questions about X can be answered in terms of the cdf F(·). For example,

P{a < X ≤ b} = F(b) − F(a)  for all a < b.

This follows since we may calculate P{a < X ≤ b} by first computing the probability that X ≤ b [that is, F(b)] and then subtracting from it the probability that X ≤ a [that is, F(a)].

If we desire the probability that X is strictly smaller than b, we may calculate it by

P{X < b} = lim_{h→0+} P{X ≤ b − h} = lim_{h→0+} F(b − h),

where lim_{h→0+} means that we take the limit as h decreases to 0. Note that P{X < b} does not necessarily equal F(b), since F(b) also includes the probability that X equals b.

Practice Problems

1. Suppose a die is rolled twice. What are the possible values that the following random variables can take on?
(i) The maximum value to appear in the two rolls.
(ii) The minimum value to appear in the two rolls.
(iii) The sum of the two rolls.
(iv) The value of the first roll minus the value of the second roll.
Ans: (i) 1, 2, 3, 4, 5, 6. (ii) 1, 2, 3, 4, 5, 6. (iii) 2, 3, 4, …, 11, 12. (iv) −5, −4, …, 4, 5.

2. If the die in Question 1 is assumed fair, calculate the probabilities associated with the random variables in (i)–(iv).
Ans: P{max = 6} = 11/36 = P{min = 1}, P{max = 5} = 1/4 = P{min = 2}, P{max = 4} = 7/36 = P{min = 3}, P{max = 3} = 5/36 = P{min = 4}, P{max = 2} = 1/12 = P{min = 5}, P{max = 1} = 1/36 = P{min = 6}.

2.2. Discrete Random Variables

As was previously mentioned, a random variable that can take on at most a countable number of possible values is said to be discrete. For a discrete random variable X, we define the probability mass function (pmf) p(a) of X by

p(a) = P{X = a}.

The probability mass function p(a) is positive for at most a countable number of values of a. That is, if X must assume one of the values x_1, x_2, …, then p(x_i) > 0, i = 1, 2, …, and p(x) = 0 for all other values of x. Since X must take on one of the values x_i, we have

Σ_{i=1}^∞ p(x_i) = 1.

The cumulative distribution function F can be expressed in terms of p(a) by

F(a) = Σ_{all x_i ≤ a} p(x_i).

For instance, suppose X has a probability mass function given by

p(1) = 1/2,  p(2) = 1/3,  p(3) = 1/6.

Then the cumulative distribution function F of X is given by

F(a) = 0,    a < 1,
       1/2,  1 ≤ a < 2,
       5/6,  2 ≤ a < 3,
       1,    3 ≤ a.

Discrete random variables are often classified according to their probability mass function. We now consider some of these random variables.

2.2.1. The Bernoulli Random Variable

Suppose that a trial, or an experiment, whose outcome can be classified as either a "success" or a "failure" is performed. If we let X equal 1 if the outcome is a success and 0 if it is a failure, then the probability mass function of X is given by

p(0) = P{X = 0} = 1 − p,
p(1) = P{X = 1} = p,                                        (2.2)

where p, 0 < p < 1, is the probability that the trial is a "success." A random variable X is said to be a Bernoulli random variable if its probability mass function is given by Equation (2.2) for some p ∈ (0, 1).

2.2.2. The Binomial Random Variable
Suppose that n independent trials, each of which results in a "success" with probability p and in a "failure" with probability 1 − p, are to be performed. If X represents the number of successes that occur in the n trials, then X is said to be a binomial random variable with parameters (n, p). The probability mass function of a binomial random variable having parameters (n, p) is given by

p(i) = C_i^n p^i (1 − p)^{n−i},  i = 0, 1, …, n,                                        (2.3)

where C_i^n = n! / ((n − i)! i!) equals the number of different groups of i objects that can be chosen from a set of n objects.

The validity of Equation (2.3) may be verified by first noting that the probability of any particular sequence of the n outcomes containing i successes and n − i failures is, by the assumed independence of trials, p^i (1 − p)^{n−i}. Equation (2.3) then follows since there are C_i^n different sequences of the n outcomes leading to i successes and n − i failures. For instance, if n = 3, i = 2, then there are C_2^3 = 3 ways in which the three trials can result in two successes. Note that, by the binomial theorem, the probabilities sum to one; that is,

Σ_{i=0}^n p(i) = Σ_{i=0}^n C_i^n p^i (1 − p)^{n−i} = (p + (1 − p))^n = 1.

Example 2.6 Four fair coins are flipped. If the outcomes are assumed independent, what is the probability that two heads and two tails are obtained?

Solution: Letting X equal the number of heads ("successes") that appear, X is a binomial random variable with parameters (n = 4, p = 1/2). Hence, by Equation (2.3),

P{X = 2} = C_2^4 (1/2)^2 (1 − 1/2)^{4−2} = 3/8. ♦

Example 2.7 It is known that any item produced by a certain machine will be defective with probability 0.1, independently of the other items. What is the probability that in a sample of three items, at most one will be defective?
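Equation (2.3) translates directly into a few lines of code. A minimal sketch (the function name `binomial_pmf` is ours) that reproduces Example 2.6's answer of 3/8:

```python
from math import comb

def binomial_pmf(n, p, i):
    """Equation (2.3): P{X = i} = C(n, i) * p^i * (1 - p)^(n - i)."""
    return comb(n, i) * p ** i * (1 - p) ** (n - i)

# Example 2.6: four fair coins, exactly two heads.
print(binomial_pmf(4, 0.5, 2))   # 0.375, i.e. 3/8
# The probabilities sum to one, as the binomial theorem guarantees:
print(sum(binomial_pmf(4, 0.5, i) for i in range(5)))
```

Here `math.comb(n, i)` plays the role of C_i^n, the number of ways to choose i successes out of n trials.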
Solution: If X is the number of defective items in the sample, then X is a binomial random variable with parameters (n = 3, p = 0.1). Hence, the desired probability is given by

P{X = 0} + P{X = 1} = C_0^3 (0.1)^0 (0.9)^3 + C_1^3 (0.1)^1 (0.9)^2 = 0.972. ♦

Example 2.8 Suppose that an airplane engine will fail, when in flight, with probability 1 − p, independently from engine to engine; suppose also that the airplane will make a successful flight if at least 50 percent of its engines remain operative. For what values of p is a four-engine plane preferable to a two-engine plane?

Solution: Because each engine is assumed to fail or function independently of what happens with the other engines, the number of engines remaining operative is a binomial random variable. Hence, the probability that a four-engine plane makes a successful flight is

C_2^4 p^2 (1 − p)^2 + C_3^4 p^3 (1 − p)^1 + C_4^4 p^4 (1 − p)^0 = 6p^2 (1 − p)^2 + 4p^3 (1 − p) + p^4,

whereas the corresponding probability for a two-engine plane is

C_1^2 p(1 − p) + C_2^2 p^2 = 2p(1 − p) + p^2.

Hence the four-engine plane is safer if

6p^2 (1 − p)^2 + 4p^3 (1 − p) + p^4 ≥ 2p(1 − p) + p^2,

which simplifies to

3p^3 − 8p^2 + 7p − 2 ≥ 0, or (1 − p)^2 (3p − 2) ≥ 0,

which is equivalent to 3p − 2 ≥ 0, or p ≥ 2/3. Hence, the four-engine plane is safer when the engine success probability is at least as large as 2/3, whereas the two-engine plane is safer when this probability falls below 2/3. ♦

Example 2.9 Suppose that a particular trait of a person (such as eye color or left-handedness) is classified on the basis of one pair of genes, and suppose that d represents a dominant gene and r a recessive gene. Thus a person with dd genes is purely dominant, one with rr is purely recessive, and one with rd is hybrid. The purely dominant and the hybrid are alike in appearance.
Children receive one gene from each parent. If, with respect to a particular trait, two hybrid parents have a total of four children, what is the probability that exactly three of the four children have the outward appearance of the dominant gene?

Solution: If we assume that each child is equally likely to inherit either of the two genes from each parent, the probabilities that the child of two hybrid parents will have dd, rr, or rd pairs of genes are, respectively, 1/4, 1/4, 1/2. Hence, because an offspring will have the outward appearance of the dominant gene if its gene pair is either dd or rd, the number of such children is binomially distributed with parameters (n = 4, p = 3/4). Thus the desired probability is

C_3^4 (3/4)^3 (1 − 3/4)^{4−3} = 27/64. ♦

Example 2.10 Now consider the logical link control (LLC) and medium access control (MAC) protocols of a wireless communication system. When an LLC frame is passed to the MAC layer, it is segmented into n MAC blocks of fixed size, and these n blocks are transmitted through the radio channel separately. Assume that an automatic repeat request (ARQ) scheme is applied in case an error occurs during transmission. Let Pc(k) denote the probability that after the (k − 1)st MAC transmission some MAC blocks remain in error and all of them are corrected by the kth MAC transmission. We are interested in the pmf of K, the number of LLC transmissions required for the error-free transmission of all n MAC blocks.

Solution: Assume that the probability of successful transmission of a single block is p (> 0). Then the probability Pc(1) that all n MAC blocks of an LLC frame are received error-free at the first transmission is

Pc(1) = p^n,

where we assume that the transmissions of the MAC blocks are statistically independent events.
To calculate Pc(2), note that 1 − (1 − p)^2 is the probability that a given MAC block is successfully received within two MAC transmissions. Then (1 − (1 − p)^2)^n is the probability that the LLC frame is correctly received within one or two transmissions. This yields

Pc(2) = (1 − (1 − p)^2)^n − p^n.

Following the above approach, the general form of Pc(k) is

Pc(k) = (1 − (1 − p)^k)^n − (1 − (1 − p)^{k−1})^n.

From the above equation we have

lim_{k→∞} Pc(k) = lim_{k→∞} {(1 − (1 − p)^k)^n − (1 − (1 − p)^{k−1})^n} = 0

and

Σ_{k=1}^∞ {(1 − (1 − p)^k)^n − (1 − (1 − p)^{k−1})^n} = 1. ♦

Remark on Terminology If X is a binomial random variable with parameters (n, p), then we say that X has a binomial distribution with parameters (n, p).

2.2.3. The Geometric Random Variable

Suppose that independent trials, each having probability p of being a success, are performed until a success occurs. If we let X be the number of trials required until the first success, then X is said to be a geometric random variable with parameter p. Its probability mass function is given by

p(n) = P{X = n} = (1 − p)^{n−1} p,  n = 1, 2, …                                        (2.4)

Equation (2.4) follows since, in order for X to equal n, it is necessary and sufficient that the first n − 1 trials be failures and the nth trial a success; the product form then follows because the outcomes of the successive trials are assumed to be independent. To check that p(n) is a probability mass function, we note that

Σ_{n=1}^∞ p(n) = p Σ_{n=1}^∞ (1 − p)^{n−1} = 1.

2.2.4. The Poisson Random Variable

A random variable X, taking on one of the values 0, 1, 2, …, is said to be a Poisson random variable with parameter λ, λ > 0, if

p(i) = P{X = i} = e^{−λ} λ^i / i!,  i = 0, 1, …                                        (2.5)

Equation (2.5) defines a probability mass function since

Σ_{i=0}^∞ p(i) = e^{−λ} Σ_{i=0}^∞ λ^i / i! = e^{−λ} · e^λ = 1.
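Equation (2.5) can be checked numerically by truncating the infinite sum, since the tail beyond a moderate index is negligible for moderate λ. A short sketch (the function name and the choice λ = 3.2 are ours):

```python
from math import exp, factorial

def poisson_pmf(lam, i):
    """Equation (2.5): P{X = i} = e^(-lam) * lam^i / i!."""
    return exp(-lam) * lam ** i / factorial(i)

lam = 3.2
# Truncating the sum at i = 100 leaves an utterly negligible tail,
# so the result should be essentially 1.
print(sum(poisson_pmf(lam, i) for i in range(101)))
```

The same helper evaluates any individual Poisson probability, e.g. `poisson_pmf(1, 0)` gives e^{−1}.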
The Poisson random variable has a wide range of applications in a diverse number of areas, as will be seen in Chapter 5. An important property of the Poisson random variable is that it may be used to approximate a binomial random variable when the binomial parameter n is large and p is small. To see this, suppose that X is a binomial random variable with parameters (n, p), and let λ = np. Then

P{X = i} = n! / ((n − i)! i!) · p^i (1 − p)^{n−i}
         = n! / ((n − i)! i!) · (λ/n)^i (1 − λ/n)^{n−i}
         = [n(n − 1)⋯(n − i + 1) / n^i] · (λ^i / i!) · (1 − λ/n)^n / (1 − λ/n)^i.

Now, for n large and p small,

(1 − λ/n)^n ≈ e^{−λ},  n(n − 1)⋯(n − i + 1) / n^i ≈ 1,  (1 − λ/n)^i ≈ 1.

Hence, for n large and p small,

P{X = i} ≈ e^{−λ} λ^i / i!.

Example 2.11 Suppose that the number of typographical errors on a single page of this book has a Poisson distribution with parameter λ = 1. Calculate the probability that there is at least one error on this page.

Solution: P{X ≥ 1} = 1 − P{X = 0} = 1 − e^{−1} ≈ 0.633. ♦

Example 2.12 If the number of accidents occurring on a highway each day is a Poisson random variable with parameter λ = 3, what is the probability that no accidents occur today?

Solution: P{X = 0} = e^{−3} ≈ 0.05. ♦

Example 2.13 Consider an experiment that consists of counting the number of α-particles given off in a one-second interval by one gram of radioactive material. If we know from past experience that, on the average, 3.2 such α-particles are given off, what is a good approximation to the probability that no more than 2 α-particles will appear?
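The quality of the approximation above can be seen by comparing binomial and Poisson probabilities with np = λ held fixed. A sketch (the function names and the illustrative choice n = 1000, λ = 1 are ours):

```python
from math import comb, exp, factorial

def binomial_pmf(n, p, i):
    # Equation (2.3)
    return comb(n, i) * p ** i * (1 - p) ** (n - i)

def poisson_pmf(lam, i):
    # Equation (2.5)
    return exp(-lam) * lam ** i / factorial(i)

lam, n = 1.0, 1000
p = lam / n  # n large, p small, np = lam
for i in range(4):
    print(i, binomial_pmf(n, p, i), poisson_pmf(lam, i))
```

Each pair of columns should agree to roughly three decimal places, and increasing n while holding np = λ fixed shrinks the discrepancy further.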
Solution: If we think of the gram of radioactive material as consisting of a large number n of atoms, each of which has probability 3.2/n of disintegrating and sending off an α-particle during the second considered, then we see that, to a very close approximation, the number of α-particles given off will be a Poisson random variable with parameter λ = 3.2. Hence the desired probability is

P{X ≤ 2} = e^{−3.2} + 3.2 e^{−3.2} + ((3.2)^2 / 2!) e^{−3.2} ≈ 0.380. ♦

Practice Problems

1. A ball is drawn from an urn containing three white and three black balls. After the ball is drawn, it is replaced and another ball is drawn. This goes on indefinitely. What is the probability that, of the first four balls drawn, exactly two are white?
Ans: C_2^4 (1/2)^2 (1/2)^2 = 3/8.

2. Let X be binomially distributed with parameters n and p. Show that as k goes from 0 to n, P{X = k}...