Instructor's Solutions Manual
Third Edition

Fundamentals of Probability, with Stochastic Processes

SAEED GHAHRAMANI
Western New England College

Upper Saddle River, New Jersey 07458

Contents
1 Axioms of Probability
1.2 Sample Space and Events
1.4 Basic Theorems
1.7 Random Selection of Points from Intervals
Review Problems

2 Combinatorial Methods
2.2 Counting Principle
2.3 Permutations
2.4 Combinations
2.5 Stirling's Formula
Review Problems

3 Conditional Probability and Independence
3.1 Conditional Probability
3.2 Law of Multiplication
3.3 Law of Total Probability
3.4 Bayes' Formula
3.5 Independence
3.6 Applications of Probability to Genetics
Review Problems

4 Distribution Functions and Discrete Random Variables
4.2 Distribution Functions
4.3 Discrete Random Variables
4.4 Expectations of Discrete Random Variables
4.5 Variances and Moments of Discrete Random Variables
4.6 Standardized Random Variables
Review Problems

5 Special Discrete Distributions
5.1 Bernoulli and Binomial Random Variables
5.2 Poisson Random Variable
5.3 Other Discrete Random Variables
Review Problems

6 Continuous Random Variables
6.1 Probability Density Functions
6.2 Density Function of a Function of a Random Variable
6.3 Expectations and Variances
Review Problems

7 Special Continuous Distributions
7.1 Uniform Random Variable
7.2 Normal Random Variable
7.3 Exponential Random Variables
7.4 Gamma Distribution
7.5 Beta Distribution
7.6 Survival Analysis and Hazard Function
Review Problems

8 Bivariate Distributions
8.1 Joint Distribution of Two Random Variables
8.2 Independent Random Variables
8.3 Conditional Distributions
8.4 Transformations of Two Random Variables
Review Problems

9 Multivariate Distributions
9.1 Joint Distribution of n > 2 Random Variables
9.2 Order Statistics
9.3 Multinomial Distributions
Review Problems

10 More Expectations and Variances
10.1 Expected Values of Sums of Random Variables
10.2 Covariance
10.3 Correlation
10.4 Conditioning on Random Variables
10.5 Bivariate Normal Distribution
Review Problems

11 Sums of Independent Random Variables and Limit Theorems
11.1 Moment-Generating Functions
11.2 Sums of Independent Random Variables
11.3 Markov and Chebyshev Inequalities
11.4 Laws of Large Numbers
11.5 Central Limit Theorem
Review Problems

12 Stochastic Processes
12.2 More on Poisson Processes
12.3 Markov Chains
12.4 Continuous-Time Markov Chains
12.5 Brownian Motion
Review Problems

Chapter 1 Axioms of Probability

1.2 SAMPLE SPACE AND EVENTS

1. For 1 ≤ i, j ≤ 3, by (i, j) we mean that Vann's card number is i and Paul's card number is
j. Clearly, A = {(1, 2), (1, 3), (2, 3)} and B = {(2, 1), (3, 1), (3, 2)}.
(a) Since A ∩ B = ∅, the events A and B are mutually exclusive.
(b) None of (1, 1), (2, 2), (3, 3) belongs to A ∪ B . Hence A ∪ B not being the sample space
shows that A and B are not complements of one another. 2. S = {RRR, RRB, RBR, RBB, BRR, BRB, BBR, BBB }.
3. {x : 0 < x < 20}; {1, 2, 3, . . . , 19}.
4. Denote the dictionaries by d1 , d2 ; the third book by a . The answers are
{d1 d2 a, d1 ad2 , d2 d1 a, d2 ad1 , ad1 d2 , ad2 d1 } and {d1 d2 a, ad1 d2 }. 5. EF : One 1 and one even.
E c F : One 1 and one odd.
E c F c : Both even or both belong to {3, 5}. 6. S = {QQ, QN, QP , QD, DN, DP , N P , N N, P P }. (a) {QP }; (b) {DN, DP , NN }; (c) ∅.
7. S = {x : 7 ≤ x ≤ 9 1/6}; {x : 7 ≤ x ≤ 7 1/4} ∪ {x : 7 3/4 ≤ x ≤ 8 1/4} ∪ {x : 8 3/4 ≤ x ≤ 9 1/6}.

8. E ∪ F ∪ G = G: If E or F occurs, then G occurs.
EF G = G: If G occurs, then E and F occur. 9. For 1 ≤ i ≤ 3, 1 ≤ j ≤ 3, by ai bj we mean passenger a gets off at hotel i and passenger b gets off at hotel j . The answers are {ai bj : 1 ≤ i ≤ 3, 1 ≤ j ≤ 3} and {a1 b1 , a2 b2 , a3 b3 },
respectively.

10. (a) (E ∪ F)(F ∪ G) = (F ∪ E)(F ∪ G) = F ∪ EG. (b) Using part (a), we have (E ∪ F)(E^c ∪ F)(E ∪ F^c) = (F ∪ EE^c)(E ∪ F^c) = F(E ∪ F^c) = FE ∪ FF^c = FE.

11. (a) AB^cC^c; (b) A ∪ B ∪ C; (c) A^cB^cC^c; (d) ABC^c ∪ AB^cC ∪ A^cBC; (e) AB^cC^c ∪ A^cB^cC ∪ A^cBC^c; (f) (A − B) ∪ (B − A) = (A ∪ B) − AB.

12. If B = ∅, the relation is obvious. If the relation is true for every event A, then it is true for S,
the sample space, as well. Thus
S = (B ∩ S c ) ∪ (B c ∩ S) = ∅ ∪ B c = B c ,
showing that B = ∅. 13. Parts (a) and (d) are obviously true; part (c) is true by DeMorgan’s law; part (b) is false: throw
a four-sided die; let F = {1, 2, 3}, G = {2, 3, 4}, E = {1, 4}.

14. (a) ∪_{n=1}^∞ An; (b) ∪_{n=1}^{37} An.

15. Straightforward.
16. Straightforward.
17. Straightforward.
18. Let a1 , a2 , and a3 be the ﬁrst, the second, and the third volumes of the dictionary. Let a4 , a5 ,
a6, and a7 be the remaining books. Let A = {a1, a2, …, a7}; the answers are S = {x1x2x3x4x5x6x7 : xi ∈ A, 1 ≤ i ≤ 7, and xi ≠ xj if i ≠ j}
and
{x1x2x3x4x5x6x7 ∈ S : xi xi+1 xi+2 = a1a2a3 for some i, 1 ≤ i ≤ 5},
respectively.

19. ∩_{m=1}^∞ ∪_{n=m}^∞ An.

20. Let B1 = A1, B2 = A2 − A1, B3 = A3 − (A1 ∪ A2), …, Bn = An − ∪_{i=1}^{n−1} Ai, ….

1.4 BASIC THEOREMS

1. No; P(sum = 11) = 2/36 while P(sum = 12) = 1/36.
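The 2/36 versus 1/36 comparison can be confirmed by listing all 36 equally likely outcomes; the short sketch below is an added check, not part of the original solution:

```python
from itertools import product

# Enumerate all 36 equally likely outcomes of rolling two fair dice and
# count how many give a sum of 11 versus a sum of 12.
sums = [a + b for a, b in product(range(1, 7), repeat=2)]
print(sums.count(11), sums.count(12))  # 2 1
```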
2. 0.33 + 0.07 = 0.40.

3. Let E be the event that an earthquake will damage the structure next year. Let H be the
event that a hurricane will damage the structure next year. We are given that P (E) = 0.015,
P (H ) = 0.025, and P (EH ) = 0.0073. Since
P (E ∪ H ) = P (E) + P (H ) − P (EH ) = 0.015 + 0.025 − 0.0073 = 0.0327,
the probability that next year the structure will be damaged by an earthquake and/or a hurricane
is 0.0327. The probability that it is not damaged by any of the two natural disasters is 0.9673. 4. Let A be the event of a randomly selected driver having an accident during the next 12 months.
Let B be the event that the person is male. By Theorem 1.7, the desired probability is
P (A) = P (AB) + P (AB c ) = 0.12 + 0.06 = 0.18. 5. Let A be the event that a randomly selected investor invests in traditional annuities. Let B be
the event that he or she invests in the stock market. Then P (A) = 0.75, P (B) = 0.45, and
P (A ∪ B) = 0.85. Since,
P (AB) = P (A) + P (B) − P (A ∪ B) = 0.75 + 0.45 − 0.85 = 0.35,
35% invest in both stock market and traditional annuities. 6. The probability that the ﬁrst horse wins is 2/7. The probability that the second horse wins
is 3/10. Since the events that the ﬁrst horse wins and the second horse wins are mutually
exclusive, the probability that either the ﬁrst horse or the second horse will win is
2/7 + 3/10 = 41/70.

7. In point of fact Rockford was right the first time. The reporter is assuming that both autopsies
are performed by a given doctor. The probability that both autopsies are performed by the same
doctor–whichever doctor it may be–is 1/2. Let AB represent the case in which Dr. A performs
the ﬁrst autopsy and Dr. B performs the second autopsy, with similar representations for other
cases. Then the sample space is S = {AA, AB, BA, BB }. The event that both autopsies are
performed by the same doctor is {AA, BB }. Clearly, the probability of this event is 2/4=1/2. 8. Let m be the probability that Marty will be hired. Then m + (m + 0.2) + m = 1 which gives
m = 8/30; so the answer is 8/30 + 2/10 = 7/15. 9. Let s be the probability that the patient selected at random suffers from schizophrenia. Then
s + s/3 + s/2 + s/10 = 1 which gives s = 15/29. 10. P (A ∪ B) ≤ 1 implies that P (A) + P (B) − P (AB) ≤ 1.
11. (a) 2/52 + 2/52 = 1/13; (b) 12/52 + 26/52 − 6/52 = 8/13; (c) 1 − (16/52) = 9/13.

12. (a) False; toss a die and let A = {1, 2}, B = {2, 3}, and C = {1, 3}. (b) False; toss a die and let A = {1, 2, 3, 4}, B = {1, 2, 3, 4, 5}, C = {1, 2, 3, 4, 5, 6}.

13. A simple Venn diagram shows that the answers are 65% and 10%, respectively.
14. Applying Theorem 1.6 twice, we have
P(A ∪ B ∪ C) = P(A ∪ B) + P(C) − P[(A ∪ B)C]
= P (A) + P (B) − P (AB) + P (C) − P (AC ∪ BC)
= P (A) + P (B) − P (AB) + P (C) − P (AC) − P (BC) + P (ABC)
= P (A) + P (B) + P (C) − P (AB) − P (AC) − P (BC) + P (ABC). 15. Using Theorem 1.5, we have that the desired probability is
P (AB − ABC) + P (AC − ABC) + P (BC − ABC)
= P (AB) − P (ABC) + P (AC) − P (ABC) + P (BC) − P (ABC)
= P (AB) + P (AC) + P (BC) − 3P (ABC). 16. 7/11.
17. ∑_{i=1}^{n} p_{ij}.

18. Let M and F denote the events that the randomly selected student earned an A on the midterm
exam and an A on the ﬁnal exam, respectively. Then
P (MF ) = P (M) + P (F ) − P (M ∪ F ),
where P (M) = 17/33, P (F ) = 14/33, and by DeMorgan’s law,
P(M ∪ F) = 1 − P(M^c F^c) = 1 − 11/33 = 22/33.

Therefore,

P(MF) = 17/33 + 14/33 − 22/33 = 9/33 = 3/11.

19. A Venn diagram shows that the answers are 1/8, 5/24, and 5/24, respectively.
20. The equation has real roots if and only if b^2 ≥ 4c. From the 36 possible outcomes for (b, c), in the following 19 cases we have that b^2 ≥ 4c: (2, 1), (3, 1), (3, 2), (4, 1), …, (4, 4), (5, 1), …, (5, 6), (6, 1), …, (6, 6). Therefore, the answer is 19/36.

21. The only prime divisors of 63 are 3 and 7. Thus the number selected is relatively prime to 63
if and only if it is neither divisible by 3 nor by 7. Let A and B be the events that the outcome is divisible by 3 and 7, respectively. The desired quantity is
P (Ac B c ) = 1 − P (A ∪ B) = 1 − P (A) − P (B) + P (AB)
= 1 − 21/63 − 9/63 + 3/63 = 36/63 = 4/7.

22. Let T and F be the events that the number selected is divisible by 3 and 5, respectively.
(a) The desired quantity is the probability of the event T F c :
P(T F^c) = P(T) − P(T F) = 333/1000 − 66/1000 = 267/1000.

(b) The desired quantity is the probability of the event T^c F^c:
P (T c F c ) = 1 − P (T ∪ F ) = 1 − P (T ) − P (F ) + P (T F )
= 1 − 333/1000 − 200/1000 + 66/1000 = 533/1000.

23. (Draw a Venn diagram.) From the data we have that 55% passed all three, 5% passed calculus
and physics but not chemistry, and 20% passed calculus and chemistry but not physics. So at
least (55 + 5 + 20)% = 80% must have passed calculus. This number is greater than the given
78% for all of the students who passed calculus. Therefore, the data is incorrect. 24. By symmetry the answer is 1/4.
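The divisibility counts used in problem 22 (333, 200, and 66 multiples, giving 267/1000 and 533/1000) can be verified by brute force; this sketch is an added check, not part of the manual:

```python
# Brute-force check of problem 22: among the integers 1 through 1000, count
# those divisible by 3 but not 5, and those divisible by neither 3 nor 5.
ns = range(1, 1001)
div3_not5 = sum(1 for n in ns if n % 3 == 0 and n % 5 != 0)
neither = sum(1 for n in ns if n % 3 != 0 and n % 5 != 0)
print(div3_not5, neither)  # 267 533
```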
25. Let A, B , and C be the events that the number selected is divisible by 4, 5, and 7, respectively.
We are interested in P (AB c C c ). Now AB c C c = A − A(B ∪ C) and A(B ∪ C) ⊆ A. So by
Theorem 1.5,
P(AB^cC^c) = P(A) − P[A(B ∪ C)] = P(A) − P(AB ∪ AC)
= P (A) − P (AB) − P (AC) + P (ABC)
= 250/1000 − 50/1000 − 35/1000 + 7/1000 = 172/1000.

26. A Venn diagram shows that the answer is 0.36.
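The inclusion-exclusion count in problem 25 can also be confirmed directly; the following check is an addition for verification only:

```python
# Brute-force check of problem 25: integers from 1 to 1000 that are divisible
# by 4 but by neither 5 nor 7.
count = sum(1 for n in range(1, 1001)
            if n % 4 == 0 and n % 5 != 0 and n % 7 != 0)
print(count)  # 172
```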
27. Let A be the event that the ﬁrst number selected is greater than the second; let B be the
event that the second number selected is greater than the ﬁrst; and let C be the event that
the two numbers selected are equal. Then P (A) + P (B) + P (C) = 1, P (A) = P (B), and
P(C) = 1/100. These give P(A) = 99/200.

28. Let B1 = A1, and for n ≥ 2, Bn = An − ∪_{i=1}^{n−1} Ai. Then {B1, B2, …} is a sequence of mutually exclusive events and ∪_{i=1}^∞ Ai = ∪_{i=1}^∞ Bi. Hence

P(∪_{n=1}^∞ An) = P(∪_{n=1}^∞ Bn) = ∑_{n=1}^∞ P(Bn) ≤ ∑_{n=1}^∞ P(An),

since Bn ⊆ An, n ≥ 1.

29. By Boole's inequality (Exercise 28),
P(∩_{n=1}^∞ An) = 1 − P(∪_{n=1}^∞ An^c) ≥ 1 − ∑_{n=1}^∞ P(An^c).

30. She is wrong! Consider the next 50 flights. For 1 ≤ i ≤ 50, let Ai be the event that the ith mission will be completed without mishap. Then ∩_{i=1}^{50} Ai is the event that all of the next 50 missions will be completed successfully. We will show that P(∩_{i=1}^{50} Ai) > 0. This proves
that Mia is wrong. Note that the probability of the simultaneous occurrence of any number of the Ai^c's is nonzero. Furthermore, consider any set E consisting of n (n ≤ 50) of the Ai^c's. It is reasonable to assume that the probability of the simultaneous occurrence of the events of E is strictly less than the probability of the simultaneous occurrence of the events of any subset of E. Using these facts, it is straightforward to conclude from the inclusion–exclusion principle that

P(∪_{i=1}^{50} Ai^c) < ∑_{i=1}^{50} P(Ai^c) = ∑_{i=1}^{50} 1/50 = 1.

Thus, by DeMorgan's law,

P(∩_{i=1}^{50} Ai) = 1 − P(∪_{i=1}^{50} Ai^c) > 1 − 1 = 0.

31. Q satisfies Axioms 1 and 2, but not necessarily Axiom 3. So it is not, in general, a probability
on S. Let S = {1, 2, 3}, and let P({1}) = P({2}) = P({3}) = 1/3. Then Q({1}) = Q({2}) = 1/9, whereas Q({1, 2}) = [P({1, 2})]^2 = 4/9. Therefore,

Q({1, 2}) ≠ Q({1}) + Q({2}).

R is not a probability on S because it does not satisfy Axiom 2; that is, R(S) ≠ 1.

32. Let BRB mean that a blue hat is placed on the first player's head, a red hat on the second
player’s head, and a blue hat on the third player’s head, with similar representations for other
cases. The sample space is
S = {BBB, BRB, BBR, BRR, RRR, RRB, RBR, RBB }.
This shows that the probability that two of the players will have hats of the same color and
the third player's hat will be of the opposite color is 6/8 = 3/4. The following improvement, based on this observation, explained by Sara Robinson in the Tuesday, April 10, 2001, issue of
the New York Times, is due to Professor Elwyn Berlekamp of the University of California at
Berkeley.
Three-fourths of the time, two of the players will have hats of the same color and
the third player’s hat will be the opposite color. The group can win every time this
happens by using the following strategy: Once the game starts, each player looks
at the other two players’ hats. If the two hats are different colors, he [or she] passes.
If they are the same color, the player guesses his [or her] own hat is the opposite
color. This way, every time the hat colors are distributed two and one, one player
will guess correctly and the others will pass, and the group will win the game. When
all the hats are the same color, however, all three players will guess incorrectly and
the group will lose.

1.7 RANDOM SELECTION OF POINTS FROM INTERVALS

1. (30 − 10)/(30 − 0) = 2/3.

2. (0.0635 − 0.04)/(0.12 − 0.04) = 0.294.

3. (a) False; in the experiment of choosing a point at random from the interval (0, 1), let
A = (0, 1) − {1/2}. A is not the sample space but P (A) = 1.
(b) False; in the same experiment P({1/2}) = 0 while {1/2} ≠ ∅.

4. P(A ∪ B) ≥ P(A) = 1, so P(A ∪ B) = 1. This gives
P(AB) = P(A) + P(B) − P(A ∪ B) = 1 + 1 − 1 = 1.

5. The answer is P({1, 2, …, 1999}) = ∑_{i=1}^{1999} P({i}) = ∑_{i=1}^{1999} 0 = 0.

6. For i = 0, 1, 2, …, 9, the probability that i appears as the first digit of the decimal representation of the selected point is the probability that the point falls into the interval [i/10, (i + 1)/10). Therefore, it equals

[(i + 1)/10 − i/10] / (1 − 0) = 1/10.

This shows that all numerals are equally likely to appear as the first digit of the decimal
representation of the selected point. 8 Chapter 1 Axioms of Probability 7. No, it is not. Let S = {w1 , w2 , . . . }. Suppose that for some p > 0, P {wi } = p, i = 1, 2,
…. Then, by Axioms 2 and 3, ∑_{i=1}^∞ p = 1. This is impossible.

8. Use induction. For n = 1, the theorem is trivial. Exercise 4 proves the theorem for n = 2.
Suppose that the theorem is true for n. We show it for n + 1, P (A1 A2 · · · An An+1 ) = P (A1 A2 · · · An ) + P (An+1 ) − P (A1 A2 · · · An ∪ An+1 )
= 1 + 1 − 1 = 1,
where P (A1 A2 · · · An ) = 1 is true by the induction hypothesis, and
P (A1 A2 · · · An ∪ An+1 ) ≥ P (An+1 ) = 1,
implies that P (A1 A2 · · · An ∪ An+1 ) = 1.
9. (a) Clearly, 1/2 ∈ ∩_{n=1}^∞ (1/2 − 1/(2n), 1/2 + 1/(2n)). If x ∈ ∩_{n=1}^∞ (1/2 − 1/(2n), 1/2 + 1/(2n)), then, for all n ≥ 1,

1/2 − 1/(2n) < x < 1/2 + 1/(2n).

Letting n → ∞, we obtain 1/2 ≤ x ≤ 1/2; thus x = 1/2.

(b) Let An be the event that the point selected at random is in (1/2 − 1/(2n), 1/2 + 1/(2n)); then

A1 ⊇ A2 ⊇ A3 ⊇ · · · ⊇ An ⊇ An+1 ⊇ · · · .

Since P(An) = 1/n, by the continuity property of the probability function,

P({1/2}) = lim_{n→∞} P(An) = 0.

10. The set of rational numbers is countable. Let Q = {r1, r2, r3, …} be the set of rational
numbers in (0, 1). Then
P(Q) = P({r1, r2, r3, …}) = ∑_{i=1}^∞ P({ri}) = 0.

Let I be the set of irrational numbers in (0, 1); then
P(I) = P(Q^c) = 1 − P(Q) = 1.

11. For i = 0, 1, 2, …, 9, the probability that i appears as the nth digit of the decimal representation of the selected point is the probability that the point falls into the following subset of (0, 1):

∪_{m=0}^{10^{n−1}−1} ( (10m + i)/10^n , (10m + i + 1)/10^n ).

Since the intervals in this union are mutually exclusive, the probability that the point falls into this subset is

∑_{m=0}^{10^{n−1}−1} [ (10m + i + 1)/10^n − (10m + i)/10^n ] / (1 − 0) = 10^{n−1} · (1/10^n) = 1/10.
This shows that all numerals are equally likely to appear as the nth digit of the decimal
representation of the selected point. 12. P (Bm ) ≤ ∞
n=m ∞
n=1 P (An ). Since P (An ) converges,
∞ lim P (Bm ) ≤ lim m→∞ m→∞ P (An ) = 0.
n=m This gives limm→∞ P (Bm ) = 0. Therefore,
B1 ⊇ B2 ⊇ B3 ⊇ · · · ⊇ Bm ⊇ Bm+1 ⊇ · · ·
implies that ∞ ∞ ∞ An = P P Bm = lim P (Bm ) = 0. m=1 n=m m=1 m→∞ 13. In the experiment of choosing a random point from (0, 1), let Et = (0, 1) − {t }, for 0 < t < 1.
Then P(Et) = 1 for all t, while P(∩_{t∈(0,1)} Et) = P(∅) = 0.

14. Clearly rn ∈ (αn, βn). By the geometric series theorem,
∑_{n=1}^∞ (βn − αn) = ∑_{n=1}^∞ ε/2^{n+1} = ε · (1/4)/(1 − 1/2) = ε/2 < ε.

REVIEW PROBLEMS FOR CHAPTER 1
1. (3.25 − 2)/(4.3 − 2) = 0.54.

2. We have that
S = { {∅, {1}}, {∅, {2}}, {∅, {1, 2}}, {{1}, {2}}, {{1}, {1, 2}}, {{2}, {1, 2}} }.

The desired events are

(a) { {∅, {1}}, {∅, {2}}, {∅, {1, 2}}, {{1}, {2}} };
(b) { {∅, {1, 2}}, {{1}, {2}} };
(c) { {∅, {1}}, {∅, {2}}, {∅, {1, 2}}, {{1}, {1, 2}}, {{2}, {1, 2}} }.

3. Since A ⊆ B, we have that B^c ⊆ A^c. This implies that (a) is false but (b) is true.
4. In the experiment of tossing a die let A = {1, 3, 5} and B = {5}; then both (a) and (b) are
false.

5. We may define a sample space S as follows:

S = {x1x2 · · · xn : n ≥ 1, xi ∈ {H, T}; xi ≠ xi+1 for 1 ≤ i ≤ n − 2; xn−1 = xn}.
7. We have that ABC ⊆ BC , so P (ABC) ≤ P (BC) and hence P (BC) − P (ABC) ≥ 0. This
and the following give the result.
P (A ∪ B ∪ C) = P (A) + P (B) + P (C) − P (AB) + P (AC) + P (BC) − P (ABC)
≤ P (A) + P (B) + P (C). 8. If P (AB) = P (AC) = P (BC) = 0, then P (ABC) = 0 since ABC ⊆ AB. These imply that
P (A ∪ B ∪ C) = P (A) + P (B) + P (C) − P (AB) − P (AC) − P (BC) + P (ABC)
= P (A) + P (B) + P (C).
Now suppose that
P (A ∪ B ∪ C) = P (A) + P (B) + P (C).
This relation implies that
P (AB) + P (BC) + P (AC) − P (ABC) = 0. (1) Since P (AC) − P (ABC) ≥ 0 we have that the sum of three nonnegative quantities is 0; so
each of them is 0. That is,
P (AB) = 0, P (BC) = 0, P (AC) = P (ABC). (2) Now rewriting (1) as
P (AB) + P (AC) + P (BC) − P (ABC) = 0,
the same argument implies that
P (AB) = 0, P (AC) = 0, P (BC) = P (ABC). Comparing (2) and (3) we have
P (AB) = P (AC) = P (BC) = 0. (3) Chapter 1 Review Problems 11 9. Let W be the event that a randomly selected person from this community drinks or serves
white wine. Let R be the event that she or he drinks or serves red wine. We are given that
P (W ) = 0.40, P (R) = 0.50, and P (W ∪ R) = 0.70. Since
P (W R) = P (W ) + P (R) − P (W ∪ R) = 0.40 + 0.50 − 0.70 = 0.20,
20% percent drink or serve both red and white wine. 10. No, it is not right. The probability that the second student chooses the tire the ﬁrst student
chose is 1/4. 11. By De Morgan’s second law,
P (Ac B c ) = 1 − P (Ac B c )c = 1 − P (A ∪ B) = 1 − P (A) − P (B) + P (AB). 12. By Theorem 1.5 and the fact that A − B and B − A are mutually exclusive,
P (A − B) ∪ (B − A) = P (A − B) + P (B − A) = P (A − AB) + P (B − AB)
= P (A) − P (AB) + P (B) − P (AB) = P (A) + P (B) − 2P (AB). 13. Denote a box of books by ai , if it is received from publisher i , i = 1, 2, 3. The sample space
is
S = x1 x2 x3 x4 x5 x6 : two of the xi ’s are a1 , two of them are a2 , and the remaining two are a3 . The desired event is E = x1 x2 x3 x4 x5 x6 ∈ S : x5 = x6 . 14. Let E , F , G, and H be the events that the next baby born in this town has blood type O, A, B,
and AB, respectively. Then
P(E) = P(F), P(G) = (1/10)P(F), P(G) = 2P(H).

These imply
P (E) = P (F ) = 20P (H ).
Therefore, from
P (E) + P (F ) + P (G) + P (H ) = 1,
we get
20P (H ) + 20P (H ) + 2P (H ) + P (H ) = 1,
which gives P (H ) = 1/43. 15. Let F , S , and N be the events that the number selected is divisible by 4, 7, and 9, respectively.
We are interested in P(F^c S^c N^c), which is equal to 1 − P(F ∪ S ∪ N) by DeMorgan's law. Now
P (F ∪ S ∪ N) = P (F ) + P (S) + P (N ) − P (F S) − P (F N ) − P (SN ) + P (F SN )
= 250/1000 + 142/1000 + 111/1000 − 35/1000 − 27/1000 − 15/1000 + 3/1000 = 0.429.

So the desired probability is 0.571.

16. The number is relatively prime to 150 if it is not divisible by 2, 3, or 5. Let A, B, and C be the
events that the number selected is divisible by 2, 3, and 5, respectively. We are interested in
P (Ac B c C c ) = 1 − P (A ∪ B ∪ C). Now
P (A ∪ B ∪ C) = P (A) + P (B) + P (C) − P (AB) − P (AC) − P (BC) + P (ABC)
= 75/150 + 50/150 + 30/150 − 25/150 − 15/150 − 10/150 + 5/150 = 11/15.
Therefore, the answer is 1 − 11/15 = 4/15.
17. (a) U_i^c D_i^c; (b) U1 U2 · · · Un; (c) (U1^c D1^c) ∪ (U2^c D2^c) ∪ · · · ∪ (Un^c Dn^c); (d) (U1 D2 U3^c D3^c) ∪ (U1 U2^c D2^c D3) ∪ (D1 U2 U3^c D3^c) ∪ (D1 U2^c D2^c U3) ∪ (D1^c U1^c D2 U3) ∪ (D1^c U1^c U2 D3) ∪ (D1^c U1^c D2^c U2^c D3^c U3^c); (e) D1^c D2^c · · · Dn^c.

18. (199 − 96)/(199 − 0) = 103/199.
19. We must have b^2 < 4ac. There are 6 × 6 × 6 = 216 possible outcomes for a, b, and c. For cases in which a < c, a > c, and a = c, it can be checked that there are 73, 73, and 27 cases in which b^2 < 4ac, respectively. Therefore, the desired probability is
(73 + 73 + 27)/216 = 173/216.
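The 73, 73, and 27 cases claimed in problem 19 can be counted exhaustively; this sketch is an added verification, not part of the manual:

```python
from itertools import product

# Brute-force check of review problem 19: for three rolls of a fair die giving
# a, b, c, the quadratic ax^2 + bx + c has non-real roots iff b^2 < 4ac.
count = sum(1 for a, b, c in product(range(1, 7), repeat=3) if b * b < 4 * a * c)
print(count)  # 173
```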
Chapter 2 Combinatorial Methods

2.2 COUNTING PRINCIPLE

1. The total number of six-digit numbers is 9 × 10 × 10 × 10 × 10 × 10 = 9 × 10^5, since the first digit cannot be 0. The number of six-digit numbers without the digit five is 8 × 9 × 9 × 9 × 9 × 9 = 8 × 9^5. Hence there are 9 × 10^5 − 8 × 9^5 = 427,608 six-digit numbers that contain the digit five.

2. (a) 5^5 = 3125. (b) 5^3 = 125.

3. There are 26 × 26 × 26 = 17,576 distinct sets of initials. Hence in any town with more than 17,576 inhabitants, there are at least two persons with the same initials. The answer to the question is therefore yes.

4. 4^15 = 1,073,741,824.
5. 2/2^23 = 1/2^22 ≈ 0.00000024.

6. (a) 52^5 = 380,204,032. (b) 52 × 51 × 50 × 49 × 48 = 311,875,200.

7. 6/36 = 1/6.

8. (a) (4 × 3 × 2 × 2)/(12 × 8 × 8 × 4) = 1/64. (b) 1 − (8 × 5 × 6 × 2)/(12 × 8 × 8 × 4) = 27/32.

9. 1/4^15 ≈ 0.00000000093.

10. 26 × 25 × 24 × 10 × 9 × 8 = 11,232,000.
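The two counts in problem 6, sampling with and without replacement, can be checked with the standard library; the snippet below is an added illustration, not part of the original solutions:

```python
import math

# Check of problem 6: drawing five cards with replacement gives 52^5 ordered
# outcomes; without replacement it is the falling factorial 52·51·50·49·48.
with_replacement = 52 ** 5
without_replacement = math.perm(52, 5)
print(with_replacement, without_replacement)  # 380204032 311875200
```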
11. There are 26^3 × 10^2 = 1,757,600 such codes; so the answer is positive.
12. 2^{nm}.

13. (2 + 1)(3 + 1)(2 + 1) = 36. (See the solution to Exercise 24.)

14. There are (2^6 − 1) × 2^3 = 504 possible sandwiches. So the claim is true.

15. (a) 5^4 = 625. (b) 5^4 − 5 × 4 × 3 × 2 = 505.

16. 2^12 = 4096.
17. 1 − (48 × 48 × 48 × 48)/(52 × 52 × 52 × 52) = 0.274.

18. 10 × 9 × 8 × 7 = 5040. (a) 9 × 9 × 8 × 7 = 4536; (b) 5040 − 1 × 1 × 8 × 7 = 4984.

19. 1 − (N − 1)^n / N^n.

20. By Example 2.6, the probability is 0.507 that among Jenny and the next 22 people she meets
randomly there are two with the same birthday. However, it is quite possible that one of these
two persons is not Jenny. Let n be the minimum number of people Jenny must meet so that
the chances are better than even that someone shares her birthday. To ﬁnd n, let A denote the
event that among the next n people Jenny meets randomly someone’s birthday is the same as
Jenny’s. We have
P(A) = 1 − P(A^c) = 1 − 364^n/365^n.

To have P(A) > 1/2, we must find the smallest n for which

1 − 364^n/365^n > 1/2,  or  364^n/365^n < 1/2.

This gives

n > log(1/2) / log(364/365) = 252.652.
Therefore, for the desired probability to be greater than 0.5, n must be 253. To some this might
seem counterintuitive.
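The threshold n = 253 can be recomputed numerically; this short check is an addition to the text:

```python
import math

# Numerical check of problem 20: the smallest n with 1 - (364/365)**n > 1/2.
n_exact = math.log(1 / 2) / math.log(364 / 365)
n = math.ceil(n_exact)
print(n)  # 253
```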
21. Draw a tree diagram for the situation in which the salesperson goes from I to B first. In
this situation, you will ﬁnd that in 7 out of 23 cases, she will end up staying at island I . By
symmetry, if she goes from I to H , D , or F ﬁrst, in each of these situations in 7 out of 23
cases she will end up staying at island I . So there are 4 × 23 = 92 cases altogether and in
4 × 7 = 28 of them the salesperson will end up staying at island I . Since 28/92 = 0.3043, the
answer is 30.43%. Note that the probability that the salesperson will end up staying at island
I is not 0.3043 because not all of the cases are equiprobable.

22. He is at 0 first, next he goes to 1 or −1. If at 1, then he goes to 0 or 2. If at −1, then he goes to 0 or −2, and so on. Draw a tree diagram. You will find that after walking 4 blocks, he is at
one of the points 4, 2, 0, −2, or −4. There are 16 possible cases altogether. Of these 6 end up
at 0, none at 1, and none at −1. Therefore, the answer to (a) is 6/16 and the answer to (b) is 0.

23. We can think of a number less than 1,000,000 as a six-digit number by allowing it to start with 0 or 0's. With this convention, it should be clear that there are 9^6 such numbers without the digit five. Hence the desired probability is 1 − (9^6/10^6) = 0.469.
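Problem 23's answer (0.469 after rounding) can be confirmed by counting directly over the padded six-digit strings; this check is not part of the manual:

```python
# Brute-force check of problem 23: the proportion of the numbers
# 0, 1, ..., 999999 (written as six-digit strings with leading zeros)
# that contain the digit 5.
count = sum(1 for n in range(10**6) if '5' in f"{n:06d}")
print(count / 10**6)  # 0.468559
```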
24. Divisors of N are of the form p1^{e1} p2^{e2} · · · pk^{ek}, where ei = 0, 1, 2, …, ni, 1 ≤ i ≤ k. Therefore, the answer is (n1 + 1)(n2 + 1) · · · (nk + 1).

25. There are 6^4 possibilities altogether. In 5^4 of these possibilities there is no 3. In 5^3 of these
possibilities only the first die lands 3. In 5^3 of these possibilities only the second die lands 3, and so on. Therefore, the answer is

(5^4 + 4 × 5^3)/6^4 = 0.868.

26. Any subset of the set {salami, turkey, bologna, corned beef, ham, Swiss cheese, American
cheese} except the empty set can form a reasonable sandwich. There are 27 − 1 possibilities.
To every sandwich a subset of the set {lettuce, tomato, mayonnaise} can also be added. Since
there are 3 possibilities for bread, the final answer is (2^7 − 1) × 2^3 × 3 = 3048 and the advertisement is true.

27. (11 × 10 × 9 × 8 × 7 × 6 × 5 × 4)/11^8 = 0.031.

28. For i = 1, 2, 3, let Ai be the event that no one departs at stop i. The desired quantity is
P(A1^c A2^c A3^c) = 1 − P(A1 ∪ A2 ∪ A3). Now

P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) − P(A1A2) − P(A1A3) − P(A2A3) + P(A1A2A3)
= 2^6/3^6 + 2^6/3^6 + 2^6/3^6 − 1/3^6 − 1/3^6 − 1/3^6 + 0 = 7/27.

Therefore, the desired probability is 1 − (7/27) = 20/27.

29. For 0 ≤ i ≤ 9, the sum of the first two digits is i in (i + 1) ways. Therefore, there are (i + 1)^2
numbers in the given set with the sum of the ﬁrst two digits equal to the sum of the last two
digits and equal to i. For i = 10, there are 9^2 numbers in the given set with the sum of the first two digits equal to the sum of the last two digits and equal to 10. For i = 11, the corresponding numbers are 8^2, and so on. Therefore, there are altogether

1^2 + 2^2 + · · · + 10^2 + 9^2 + 8^2 + · · · + 1^2 = 670

numbers with the desired property, and hence the answer is 670/10^4 = 0.067.

30. Let A be the event that the number selected contains at least one 0. Let B be the event that it
contains at least one 1 and C be the event that it contains at least one 2. The desired quantity
is P (ABC) = 1 − P (Ac ∪ B c ∪ C c ), where
P (Ac ∪ B c ∪ C c ) = P (Ac ) + P (B c ) + P (C c )
− P (Ac B c ) − P (Ac C c ) − P (B c C c ) + P (Ac B c C c )
= 8 × 9r −1
8 × 9r −1
8r
8r
9r
+
+
−
−
9 × 10r −1 9 × 10r −1 9 × 10r −1 9 × 10r −1 9 × 10r −1
− 2.3 7 × 8r −1
7r
+
.
9 × 10r −1 9 × 10r −1 PERMUTATIONS 1. The answer is 1
1
=
≈ 0.0417.
4!
24 2. 3! = 6.
3. 8!
= 56.
3! 5! 4. The probability that John will arrive right after Jim is 7!/8! (consider Jim and John as one
arrival). Therefore, the answer is 1 − (7!/8!) = 0.875. Another Solution: If Jim is the last person, John will not arrive after Jim. Therefore, the
remaining seven can arrive in 7! ways. If Jim is not the last person, the total number of
possibilities in which John will not arrive right after Jim is 7 × 6 × 6!. So the answer is
7! + 7 × 6 × 6!
= 0.875.
8! 5. (a) 312 = 531, 441. (b) 12!
= 924.
6! 6! (c) 12!
= 27, 720.
3! 4! 5! 6. 6 P2 = 30.
7. 20!
= 3, 491, 888, 400.
4! 3! 5! 8! 8. (5 × 4 × 7) × (4 × 3 × 6) × (3 × 2 × 5)
= 50, 400.
3! Section 2.3 Permutations 17 9. There are 8! schedule possibilities. By symmetry, in 8!/2 of them Dr. Richman’s lecture
precedes Dr. Chollet’s and in 8!/2 ways Dr. Richman’s lecture precedes Dr. Chollet’s. So the
answer is 8!/2 = 20, 160. 10. 11!
= 92, 400.
3! 2! 3! 3! 11. 1 − (6!/66 ) = 0.985.
12. (a) 11!
= 34, 650.
4! 4! 2 ! (b) Treating all P ’s as one entity, the answer is
(c) Treating all I ’s as one entity, the answer is 10!
= 6300.
4! 4!
8!
= 840.
4! 2 ! (d) Treating all P ’s as one entity, and all I ’s as another entity, the answer is 7!
= 210.
4! (e) By (a) and (c), The answer is 840/34650 = 0.024. 13. 8!
2! 3! 3! 68 = 0.000333. 14. 9!
3! 3! 3! 529 = 6.043 × 10−13 . 15. m!
.
(n + m)! 16. Each girl and each boy has the same chance of occupying the 13th chair. So the answer is
12/20 = 0.6. This can also be seen from 17. 12 × 19!
12
=
= 0.6.
20!
20 12!
= 0.000054.
1212 18. Look at the ﬁve math books as one entity. The answer is
19. 1 −
20. 9 P7
97 = 0.962. 2 × 5! × 5!
= 0.0079.
10! 21. n!/nn . 5! × 18!
= 0.00068.
22! 18 Chapter 2 Combinatorial Methods 22. 1 − (6!/66 ) = 0.985.
23. Suppose that A and B are not on speaking terms. 134 P4 committees can be formed in which
neither A serves nor B ; 4 ×134 P3 committees can be formed in which A serves and B does not.
The same numbers of committees can be formed in which B serves and A does not. Therefore,
the answer is 134 P4 + 2(4 ×134 P3 ) = 326, 998, 056. 24. (a) mn .
25. 3 ·
26. (a)
(b) (b) 8!
2! 3! 2! 1! m Pn . (c) n!. 68 = 0.003. 20!
= 7.61 × 10−6 .
39 × 37 × 35 × · · · × 5 × 3 × 1
1
= 3.13 × 10−24 .
39 × 37 × 35 × · · · × 5 × 3 × 1 27. Thirty people can sit in 30! ways at a round table. But for each way, if they rotate 30 times
(everybody move one chair to the left at a time) no new situations will be created. Thus in
30!/30 = 29! ways 15 married couples can sit at a round table. Think of each married couple
as one entity and note that in 15!/15 = 14! ways 15 such entities can sit at a round table. We
have that the 15 couples can sit at a round table in (2!)15 · 14! different ways because if the
couples of each entity change positions between themselves, a new situation will be created.
So the desired probability is
14!(2!)15
= 3.23 × 10−16 .
29!
The answer to the second part is
24!(2!)5
= 2.25 × 10−6 .
29! 28. In 13! ways the balls can be drawn one after another. The number of those in which the ﬁrst
white appears in the second or in the fourth or in the sixth or in the eighth draw is calculated
as follows. (These are Jack’s turns.)
8 × 5 × 11! + 8 × 7 × 6 × 5 × 9! + 8 × 7 × 6 × 5 × 4 × 5 × 7!
+ 8 × 7 × 6 × 5 × 4 × 3 × 2 × 5 × 5! = 2, 399, 846, 400.
Therefore, the answer is 2, 399, 846, 400/13! = 0.385. Section 2.4 2.4 1. 20
= 38, 760.
6 2.
i =51 4. 19 COMBINATIONS 100 3. Combinations 100
= 583, 379, 627, 841, 332, 604, 080, 945, 354, 060 ≈ 5.8 × 1029 .
i 20
6 25
= 6, 864, 396, 000.
6 12
3 40
2
52
5 = 0.066. 5. N −1
n−1 6. 5
3 2
= 10.
2 7. 8
3 5
2 8. 18
18
+
= 21, 624.
6
4 9. 10
5 N
n = n
.
N 3
= 560.
3 12
= 0.318.
7 10. The coefﬁcient of 23 x 9 in the expansion of (2 + x)12 is
is 23 12
. Therefore, the coefﬁcient of x 9
9 12
= 1760.
9 11. The coefﬁcient of (2x)3 (−4y)4 in the expansion of (2x − 4y)7 is
of x 3 y 2 in this expansion is 23 (−4)4 12. 9
3 6
6
+2
4
3 = 4620. 7
= 71, 680.
4 7
. Thus the coefﬁcient
4 20 Chapter 2 Combinatorial Methods
10 10
5 13. (a) 2 = 0.246; 10 (b)
i =5 10
i 210 = 0.623. 14. If their minimum is larger than 5, they are all from the set {6, 7, 8, . . . , 20}. Hence the answer
15
5 is 6
2 15. (a) 16. 50
5 n 28
4
34
6 150
45
200
50 n
=
2
i = 0.228; i =0
n x
i =0 i n
=
i 64
5
2 19. 212 (b) 6
6
10
12
+
+
+
6
6
6
6
34
6 = 0.00206. n i 17. 18. 20
= 0.194.
5 i =0
n
i =0 n i n−i
2 1 = (2 + 1)n = 3n .
i
n i n−i
x 1 = (x + 1)n .
i 66 = 0.201. 24
= 0.00151.
12 20. Royal Flush: Straight ﬂush: 4
52
5
36
52
5 = 0.0000015. = 0.000014. 13 × 12
Four of a kind: 52
5 4
1 = 0.00024. = 0.00084. Section 2.4 13
Full house: 4
Flush: Straight: 4
4
· 12
3
2
52
5 = 0.0014. 10(4)5 − 40
= 0.0039.
52
5 Three of a kind: 13
2 13 One pair: 21 13
− 40
5
= 0.002.
52
5 13 Two pairs: Combinations 4
2 4
12 2
·
4
3
2
= 0.021.
52
5
4
4
· 11
2
1
52
5 = 0.048. 4
12 3
·
4
2
3
= 0.42.
52
5 None of the above: 1− the sum of all of the above cases = 0.5034445. 21. The desired probability is
12
6 12
6
24
12 = 0.3157. x
3
x(x − 1)(x − 2) = 120 and its solution is x = 6. 22. The answer is the solution of the equation = 20. This equation is equivalent to 22 Chapter 2 Combinatorial Methods 23. There are 9×103 = 9000 fourdigit numbers. From every 4combination of the set {0, 1, . . . , 9},
exactly one fourdigit number can be constructed in which its ones place is less than its tens
place, its tens place is less than its hundreds place, and its hundreds place is less than its
10
thousands place. Therefore, the number of such fourdigit numbers is
= 210. Hence
4
the desired probability is 0.023333. 24.
(x + y + z)2 =
n1 +n2 +n3 = n!
x n1 y n2 zn3
n1 ! n2 ! n3 !
=2 2!
2!
2!
x 2 y 0 z0 +
x 0 y 2 z0 +
x 0 y 0 z2
2! 0! 0!
0! 2! 0!
0! 0! 2!
+ 2!
2!
2!
x 1 y 1 z0 +
x 1 y 0 z1 +
x 0 y 1 z1
1! 1! 0!
1! 0! 1!
0! 1! 1! = x 2 + y 2 + z2 + 2xy + 2xz + 2yz. 25. The coefﬁcient of (2x)2 (−y)3 (3z)2 in the expansion of (2x − y + 3z)7 is
coefﬁcient of x 2 y 3 z2 in this expansion is 22 (−1)3 (3)2 7!
= −7560.
2! 3! 2! 7!
. Thus the
2! 3! 2! 13!
. Therefore,
3! 7! 3!
= −7, 413, 120. 26. The coefﬁcient of (2x)3 (−y)7 (3)3 in the expansion of (2x − y + 3)13 is
the coefﬁcient of x 3 y 7 in this expansion is 23 (−1)7 (3)3 13!
3! 7! 3! 52!
52!
ways 52 cards can be dealt among four people. Hence the sample
=
13! 13! 13! 13!
(13!)4
space contains 52!/(13!)4 points. Now in 4! ways the four different suits can be distributed
among the players; thus the desired probability is 4!/[52!/(13!)4 ] ≈ 4.47 × 10−28 . 27. In 28. The theorem is valid for k = 2; it is the binomial expansion. Suppose that it is true for all
integers ≤ k − 1. We show it for k . By the binomial expansion,
n (x1 + x2 + · · · + xk ) =
n n1 =0
n =
n1 =0 n n1
x
n1 1 n =
n1 +n2 +···+nk =n n n1
x (x2 + · · · + xk )n−n1
n1 1 2 +n3 +···+nk =n−n1 (n − n1 )!
n
x n2 x n3 · · · xk k
n2 ! n3 ! · · · nk ! 2 3 n
(n − n1 )!
n
x n1 x n2 · · · xk k
n1 n2 ! n3 ! · · · nk ! 1 2 Section 2.4 =
n1 +n2 +···+nk Combinations 23 n!
n
x n1 x n2 · · · xk k .
n1 ! n2 ! · · · nk ! 1 2
=n 29. We must have 8 steps. Since the distance from M to L is ten 5centimeter intervals and the
ﬁrst step is made at M, there are 9 spots left at which the remaining 7 steps can be made. So
9
the answer is
= 36.
7 30. (a) 2
1 98
98
+
49
48
100
50 = 0.753; (b) 250 100
= 1.16 × 10−14 .
50 31. (a) It must be clear that
n
2
n1
+ nn1
n2 =
2
n2
+ n2 (n + n1 )
n3 =
2
n3
+ n3 (n + n1 + n2 )
n4 =
2
.
.
.
nk−1
nk =
+ nk−1 (n + n1 + · · · + nk−1 ).
2
n1 = (b) For n = 25, 000, successive calculations of nk ’s yield,
n1 = 312, 487, 500,
n2 = 48, 832, 030, 859, 381, 250,
n3 = 1, 192, 283, 634, 186, 401, 370, 231, 933, 886, 715, 625,
n4 = 710, 770, 132, 174, 366, 339, 321, 713, 883, 042, 336, 781, 236,
550, 151, 462, 446, 793, 456, 831, 056, 250.
For n = 25, 000, the total number of all possible hybrids in the ﬁrst four generations,
n1 + n2 + n3 + n4 , is 710,770,132,174,366,339,321,713,883,042,337,973,520,184,337,
863,865,857,421,889,665,625. This number is approximately 710 × 1063 . 32. For n = 1, we have the trivial identity
x+y = 1 0 1−0
1 1 1−1
xy
+
xy .
0
1 24 Chapter 2 Combinatorial Methods Assume that
(x + y)n−1 = n−1
i =0 n − 1 i n−1−i
.
xy
i This gives
n−1 (x + y)n = (x + y)
i =0
n−1 n − 1 i n−1−i
xy
i n − 1 i +1 n−1−i
+
xy
i =
i =0
n n − 1 i n−i
+
xy
i−1 =
i =1 n−1
i =1
n−1
i =1 i =0 n − 1 i n−i
xy
i i =0 n − 1 i n−i
xy
i n−1
n−1
+
i−1
i =x +
n = xn + n−1 n−1 n i n−i
+ yn =
xy
i x i y n−i + y n
n i =0 n i n−i
xy .
i 33. The desired probability is computed as follows.
12
6 34. (a) (c) 35. 30
2 28
2 26
2 10 6
2
6
= 0.347;
20
6
10
2 26
13 26
13
52
26 24
2 10
1 (b) 82
2
2
= 0.130;
20
6 = 0.218. 22
2 (d) 20
2 18
3 15
3 94
2
4
= 0.520;
20
6
10
3
20
6 = 0.0031. 12
3 9
3 6
3 3
3 1230 ≈ 0.000346. Section 2.4 Combinations 25 36. Let a 6element combination of a set of integers be denoted by {a1 , a2 , . . . , a6 }, where a1 <
a2 < · · · < a6 . It can be easily veriﬁed that the function h : B → A deﬁned by
h {a1 , a2 , . . . , a6 } = {a1 , a2 + 1, . . . , a6 + 5}
is onetoone and onto. Therefore, there is a onetoone correspondence between B and
44
A . This shows that the number of elements in A is
. Thus the probability that no
6
44
49
consecutive integers are selected among the winning numbers is
≈ 0.505. This
6
6
implies that the probability of at least two consecutive integers among the winning numbers
is approximately 1 − 0.505 = 0.495. Given that there are 47 integers between 1 and 49, this
high probability might be counterintuitive. Even without knowledge of expected value, a
keen student might observe that, on the average, there should be (49 − 1)/7 = 6.86 numbers
between each ai and ai +1 , 1 ≤ i ≤ 5. Thus he or she might erroneously think that it is unlikely
to obtain consecutive integers frequently. 37. (a) Let Ei be the event that car i remains unoccupied. The desired probability is
cc
c
P (E1 E2 · · · En ) = 1 − P (E1 ∪ E2 ∪ · · · ∪ En ). Clearly,
P (Ei ) = (n − 1)m
,
nm 1 ≤ i ≤ n; P (Ei Ej ) = (n − 2)m
,
nm 1 ≤ i, j ≤ n, i = j ; P (Ei Ej Ek ) = (n − 3)m
,
nm 1 ≤ i, j, k ≤ n, i = j = k ; and so on. Therefore, by the inclusionexclusion principle,
n (−1)i −1 P (E1 ∪ E2 ∪ · · · ∪ En ) =
i =1 n (n − i)m
.
i
nm So
n (−1)i −1 cc
c
P (E1 E2 · · · En ) = 1 −
i =1 = 1
nm n (−1)i
i =0 n (n − i)m
=
i
nm n (−1)i
i =0 n (n − i)m
i
nm n
(n − i)m .
i Let F be the event that cars 1, 2, . . . , n − r are all occupied and the remaining cars are
n
unoccupied. The desired probability is
P (F ). Now by part (a), the number of ways m
r (b) 26 Chapter 2 Combinatorial Methods passengers can be distributed among n − r cars, no car remaining unoccupied is
n−r (−1)i
i =0 n−r
(n − r − i)m .
i So
P (F ) = 1
nm n−r (−1)i
i =0 n−r
(n − r − i)m
i and hence the desired probability is 1n
nm r n−r (−1)i
i =0 n−r
(n − r − i)m .
i 38. Let the n indistinguishable balls be represented by n identical oranges and the n distinguishable
cells be represented by n persons. We should count the number of different ways that the n
oranges can be divided among the n persons, and the number of different ways in which exactly
one person does not get an orange. The answer to the latter part is n(n − 1) since in this case
one person does not get an orange, one person gets exactly two oranges, and the remaining
persons each get exactly one orange. There are n choices for the person who does not get
an orange and n − 1 choices for the person who gets exactly two oranges; n(n − 1) choices
altogether. To count the number of different ways that the n oranges can be divided among the
n persons, add n − 1 identical apples to the oranges and note that by Theorem 2.4, the total
(2n − 1)!
number of permutations of these n − 1 apples and n oranges is
. (We can arrange
n! (n − 1)!
n − 1 identical apples and n identical oranges in a row in (2n − 1)!/ n! (n − 1)! ways.) Now
(2n − 1)!
2n − 1
each one of these
=
permutations corresponds to a way of dividing the
n! (n − 1)!
n
n oranges among the n persons and vice versa. Give all of the oranges preceding the ﬁrst apple
to the ﬁrst person, the oranges between the ﬁrst and the second apples to the second person,
the oranges between the second and the third apples to the third person and so on. Therefore,
if, for example, an apple appears in the beginning of the permutation, the ﬁrst person does not
get an orange, and if two apples are at the end of the permutations, the (n − 1)st and the nth
2n − 1
persons get no oranges. Thus the answer is n(n − 1)
.
n 39. The left side of the identity is the binomial expansion of (1 − 1)n = 0. Section 2.4 Combinations 27 40. Using the hint, we have
n
n+1
n+2
n+r
+
+
+ ··· +
0
1
2
r
= n
n+2
n+1
n+3
n+2
+
−
+
−
0
1
0
2
1
+ = n+4
n+3
n+r +1
n+r
−
+ ··· +
−
3
2
r
r −1 n
n+1
n+r +1
n+r +1
−
+
=
.
0
0
r
r 41. The identity expresses that to choose r balls from n red and m blue balls, we must choose
either r red balls, 0 blue balls or r − 1 red balls, one blue ball or r − 2 red balls, two blue balls
or · · · 0 red balls, r blue balls. 42. Note that 1
1
n
n+1
=
. Hence
i+1 i
n+1 i+1 The given sum = 43. 53
3
2 44. (a) PN = 1
n+1 n+1
n+1
n+1
+
+ ··· +
1
2
n+1 = 1
(2n+1 − 1).
n+1 45 = 0.264.
t
m N −t
n−m
.
N
n (b) From part (a), we have
PN
(N − t)(N − n)
=
.
PN −1
N(N − t − n + m)
This implies PN > PN −1 if and only if (N − t)(N − n) > N (N − t − n + m) or, equivalently,
if and only if N ≤ nt/m. So PN is increasing if and only if N ≤ nt/m. This shows that the
maximum of PN is at [nt/m], where by [nt/m] we mean the greatest integer ≤ nt/m. 45. The sample space consists of (n + 1)4 elements. Let the elements of the sample be denoted by x1 , x2 , x3 , and x4 . To count the number of samples (x1 , x2 , x3 , x4 ) for which x1 + x2 = x3 + x4 ,
let y3 = n − x3 and y4 = n − x4 . Then y3 and y4 are also random elements from the set
{0, 1, 2, . . . , n}. The number of cases in which x1 + x2 = x3 + x4 is identical to the number of
cases in which x1 + x2 + y3 + y4 = 2n. By Example 2.23, the number of nonnegative integer 28 Chapter 2 Combinatorial Methods 2n + 3
. However, this also counts the solutions in which one
3
of x1 , x2 , y3 , and y4 is greater than n. Because of the restrictions 0 ≤ x1 , x2 , y3 , y4 ≤ n,
we must subtract, from this number, the total number of the solutions in which one of x1 , x2 ,
y3 , and y4 is greater than n. Such solutions are obtained by ﬁnding all nonnegative integer
solutions of the equation x1 + x2 + y3 + y4 = n − 1, and then adding n + 1 to exactly one
of x1 , x2 , y3 , and y4 . Their count is 4 times the number of nonnegative integer solutions of
n+2
x1 + x2 + y3 + y4 = n − 1; that is, 4
. Therefore, the desired probability is
3
solutions to this equation is 2n + 3
n+2
−4
3
3
4
(n + 1) = 2n2 + 4n + 3
.
3(n + 1)3 46. (a) The n − m unqualiﬁed applicants are “ringers.” The experiment is not affected by their
inclusion, so that the probability of any one of the qualiﬁed applicants being selected is the
same as it would be if there were only qualiﬁed applicants. That is, 1/m. This is because in
a random arrangement of m qualiﬁed applicants, the probability that a given applicant is the
ﬁrst one is 1/m.
(b) Let A be the event that a given qualiﬁed applicant is hired. We will show that P (A) =
1/m. Let Ei be the event that the given qualiﬁed applicant is the i th applicant interviewed,
and he or she is the ﬁrst qualiﬁed applicant to be interviewed. Clearly,
n−m+1 P (A) = P (Ei ),
i =1 where
P (Ei ) = n−m Pi −1 · 1 · (n − i)!
.
n! Therefore,
n−m+1 P (A) =
i =1
n−m+1 = · (n − i)!
n! n−m Pi −1 (n − m)!
(n − i)!
(n − m − i + 1)!
n! i =1
n−m+1 =
i =1
n−m+1 =
i =1 1
·
m! 1
(n − i)!
(m − 1)!
·
(n − m − i + 1)! (m − 1)!
n!
m! (n − m)! 1
·
m 1
n
m n−i
m−1 Section 2.4 1
=
·
m
n−m+1 To calculate
i =1
n−i of (1 + x) n−m+1 1
n
m Combinations n−i
.
m−1 i =1 (4) n−i
n−i
, note that
is the coefﬁcient of x m−1 in the expansion
m−1
m−1
n−m+1 n−i
is the coefﬁcient of x m−1 in the expansion of
m−1 . Therefore,
i =1
n−m+1 (1 + x)n−i = i =1
n−m+1 This shows that
i =1 n−i
m−1 (1 + x)n − (1 + x)m−1 , which is (1 + x)n − (1 + x)m−1
.
x is the coefﬁcient of x m in the expansion of
n
. So (4) implies that
m P (A) = 1
n
m 1
·
m · n
1
=.
m
m 6
equal
1 47. Clearly, N = 610 , N(Ai ) = 510 , N (Ai Aj ) = 410 , i = j , and so on. So S1 has
terms, S2 has
610 − 48. A0  = 6
equal terms, and so on. Therefore, the solution is
2 6 10
6 10
6 10
6 10
6 10
6 10
5+
4−
3+
2−
1+
0 = 16, 435, 440.
1
2
3
4
5
6 1n
23 29 n−3
,
3 The answer is A1  = 1n
23 n−3
,
2 3
1 A2  = 1n
23 3
2 n−3
.
1 A0 
(n − 4)(n − 5)
=
.
A0  + A1  + A2 
n2 + 2 49. The coefﬁcient of x n in (1 + x)2n is
n
0 2n
. Its coefﬁcient in (1 + x)n (1 + x)n is
n n
n
+
n
1
= n
0 2 + n
n
+
n−1
2
n
1 2 + n
2 n
n
+ ··· +
n−2
n 2 + ··· + n
n 2 , n
0 30 Chapter 2 since Combinatorial Methods n
n
=
, 0 ≤ i ≤ n.
i
n−1 50. Consider a particular set of k letters. Let M be the number of possibilities in which only
n
M n!. All
k
we got to do is to ﬁnd M. To do so, note that the remaining n − k letters are all addressed
incorrectly. For these n − k letters, there are n − k addresses. But the addresses are written
on the envelopes at random. The probability that none is addressed correctly on one hand is
M/(n − k)!, and on the other hand, by Example 2.24, is
these k letters are addressed correctly. The desired probability is the quantity n−k 1−
i =1 n (−1)i −1
=
i! So M satisﬁes
M
=
(n − k)! n
i =2 and hence i =2 (−1)i −1
.
i! (−1)i −1
,
i! n M = (n − k)!
i =2 (−1)i −1
.
i! The ﬁnal answer is
n
M
k
=
n! n
(n − k)!
k n
i =2 n! (−1)i −1
i! = 1
k! n
i =2 (−1)i −1
.
i! 51. The set of all sequences of H’s and T’s of length i with no successive H’s are obtained either
by adding a T to the tails of all such sequences of length i − 1, or a TH to the tails of all such
sequences of length i − 2. Therefore,
xi = xi −1 + xi −2 , i ≥ 2.
Clearly, x1 = 2 and x3 = 3. For consistency, we deﬁne x0 = 1. From the theory of recurrence
i
relations we know that the solution of xi = xi −1 + xi −2 is of the form xi = Ar1 +√ 2 , where
Br i
√
1+ 5
1− 5
and r2 =
and so
r1 and r2 are the solutions of r 2 = r + 1. Therefore, r1 =
2
2
√
√
1+ 5 i
1− 5 i
xi = A
+B
.
2
2
√
√
5+3 5
5−3 5
Using the initial conditions x0 = 1 and x2 = 2, we obtain A =
and B =
.
10
10 Section 2.5 Hence the answer is
xn
1
n= n
2
2
= Stirling’s Formula 31 √
5+3 5
10 1
10 × 22n √
√
√
1+ 5 n
5−3 5 1− 5 n
+
2
10
2
√
√n
√
√n
5+3 5 1+ 5 + 5−3 5 1− 5 . 52. For this exercise, a solution is given by Abramson and Moser in the October 1970 issue of the
American Mathematical Monthly. 2.5 STIRLING’s FORMULA 1. (a)
(b) √
4π n (2n)2n e−2n
2n 1
(2n)! 1
1
=
∼
∼√ .
2n
2n
2n e−2n 22n
n! n! 2
(2π n) n
n2
πn
√
√
3
3
4π n (2n)2n e−2n
(2n)!
2
∼√
= n.
2
4n e−4n (2π n) n2n e−2n
(4n)! (n!)
4
8π n (4n) REVIEW PROBLEMS FOR CHAPTER 2
1. The desired quantity is equal to the number of subsets of all seven varieties of fruit minus 1
(the empty set); so it is 27 − 1 = 127. 2. The number of choices Virginia has is equal to the number of subsets of {1, 2, 5, 10, 20} minus
1 (for empty set). So the answer is 25 − 1 = 31. 3. (6 × 5 × 4 × 3)/64 = 0.278.
4. 10
5. 10
= 0.222.
2 9!
= 7560.
3! 2! 2! 2! 6. 5!/5 = 4! = 24.
7. 3! · 4! · 4! · 4! = 82, 944. 8. 1 − 23
6
30
6 = 0.83. 32 Chapter 2 Combinatorial Methods 9. Since the refrigerators are identical, the answer is 1.
10. 6! = 720.
11. (Draw a tree diagram.) In 18 out of 52 possible cases the tournament ends because John wins
4 games without winning 3 in a row. So the answer is 34.62%. 12. Yes, it is because the probability of what happened is 1/72 = 0.02.
13. 9 8 = 43, 046, 721.
14. (a) 26 × 25 × 24 × 23 × 22 × 21 = 165, 765, 600;
(b) 26 × 25 × 24 × 23 × 22 × 5 = 39, 468, 000;
(c) 15. 5
3
2
1
26
25
24
23 = 21, 528, 000.
2
1
1
1 6
6
6
6
+
+
+
3
1
1
1
10
3 Another Solution: 16. 2
1 6
6
+
3
1
10
3 4
2 = 0.467. = 0.467. 8 × 4 ×6 P4
= 0.571.
8 P6 17. 1 −
18. 2
1 278
= 0.252.
288 (3!/3)(5!)3
= 0.000396.
15!/15 19. 312 = 531, 441. 20. 4
1 48
12 3
1 36
12 2
1 24
12 52!
13! 13! 13! 13! 1
1 12
12 = 0.1055. Chapter 2 Review Problems 33 21. Let A1 , A2 , A3 , and A4 be the events that there is no professor, no associate professor, no
assistant professor, and no instructor in the committee, respectively. The desired probability
is
P (Ac Ac Ac Ac ) = 1 − P (A1 ∪ A2 ∪ A3 ∪ A4 ),
1234
where P (A1 ∪ A2 ∪ A3 ∪ A4 ) is calculated using the inclusionexclusion principle:
P (A1 ∪ A2 ∪ A3 ∪ A4 ) = P (A1 ) + P (A2 ) + P (A3 ) + P (A4 )
− P (A1 A2 ) − P (A1 A3 ) − P (A1 A4 ) − P (A2 A3 ) − P (A2 A4 ) − P (A3 A4 )
+ P (A1 A2 A3 ) + P (A1 A3 A4 ) + P (A1 A2 A4 ) + P (A2 A3 A4 ) − P (A1 A2 A3 A4 )
=1
− 34
6 28
28
24
22
22
18
16
18
+
+
+
−
−
−
−
6
6
6
6
6
6
6
6 16
12
12
6
10
6
−
+
+
+
+
− 0 = 0.621.
6
6
6
6
6
6 Therefore, the desired probability equals 1 − 0.621 = 0.379. 22. (15!)2
= 0.0002112.
30!/(2!)15 23. (N − n + 1) 24. (a) (c) 4
2 48
24
52
26 13
5 39
8
52
13 N
.
n = 0.390; 8
8
39
13 31
5 (b) 40
1
52
13 = 6.299 × 10−11 ; = 0.00000261. 25. 12!/(3!)4 = 369, 600.
26. There is a onetoone correspondence between all cases in which the eighth outcome obtained
is not a repetition and all cases in which the ﬁrst outcome obtained will not be repeated. The
answer is
6×5×5×5×5×5×5×5
57
= 0.279.
=
6
6×6×6×6×6×6×6×6 27. There are 9 × 103 = 9, 000 fourdigit numbers. To count the number of desired fourdigit
numbers, note that if 0 is to be one of the digits, then the thousands place of the number must be 34 Chapter 2 Combinatorial Methods 0, but this cannot be the case since the ﬁrst digit of an ndigit number is nonzero. Keeping this
in mind, it must be clear that from every 4combination of the set {1, 2, . . . , 9}, exactly one
fourdigit number can be constructed in which its ones place is greater than its tens place, its
tens place is greater than it hundreds place, and its hundreds place is greater than its thousands
9
place. Therefore, the number of such fourdigit numbers is
= 126. Hence the desired
4
probability is = 0.014. 28. Since the sum of the digits of 100,000 is 1, we ignore 100,000 and assume that all of the numbers
have ﬁve digits by placing 0’s in front of those with less than ﬁve digits. The following process
establishes a onetoone correspondence between such numbers, d1 d2 d3 d4 d5 , 5=1 di = 8,
i
and placement of 8 identical objects into 5 distinguishable cells: Put d1 of the objects into
the ﬁrst cell, d2 of the objects into the second cell, d3 into the third cell, and so on. Since
8+5−1
12
this can be done in
=
= 495 ways, the number of integers from the set
5−1
8
{1, 2, 3, . . . , 100000} in which the sum of the digits is 8 is 495. Hence the desired probability
is 495/100, 000 = 0.00495. Chapter 3 C onditional Probability
and I ndependence
3.1 CONDITIONAL PROBABILITY 1. P (W  U ) = P (U W )
0.15
=
= 0.60.
P (U )
0.25 2. Let E be the event that in the blood of the randomly selected soldier A antigen is found. Let
F be the event that the blood type of the soldier is A. We have
P (F  E) = 3. P (F E)
0.41
=
= 0.911.
P (E)
0.41 + 0.04 0.20
= 0.625.
0.32 4. The reduced sample space is (1, 4), (2, 3), (3, 2), (4, 1), (4, 6), (5, 5), (6, 4) ; therefore, the
desired probability is 1/7. 5. 2
30 − 20
=.
30 − 15
3 6. Both of the inequalities are equivalent to P (AB) > P (A)P (B).
7. 2
1/3
=.
(1/3) + (1/2)
5 8. 4/30 = 0.133. 36 Chapter 3 Conditional Probability and Independence 40 65
2
6
105
8 = 0.239.
40
65
2
8−i
i
1−
105
i =0
8
⎧
⎪1/19 if i = 0
⎪
⎨
10. P (α = i  β = 0) = 2/19 if i = 1, 2, 3, . . . , 9
⎪
⎪
⎩
0
if i = 10, 11, 12, . . . , 18. 9. 11. Let b∗ gb mean that the oldest child of the family is a boy, the second oldest is a girl, the youngest
is a boy, and the boy found in the family is the oldest child, with similar representations for
other cases. The reduced sample space is
S = ggb∗ , gb∗ g, b∗ gg, b∗ bg, bb∗ g, gb∗ b, gbb∗ , bgb∗ , b∗ gb, b∗ bb, bb∗ b, bbb∗ .
Note that the outcomes of the sample space are not equiprobable. We have that
P {ggb∗ } = P {gb∗ g } = P {b∗ gg } = 1/7
P {b∗ bg } = P {bb∗ g } = 1/14
P {gb∗ b} = P {gbb∗ } = 1/14
P {bgb∗ } = P {b∗ gb} = 1/14
P {b∗ bb} = P {bb∗ b} = P {bbb∗ } = 1/21.
The solutions to (a), (b), (c) are as follows. (a) P {bb∗ g } = 1/14;
(b) P {bb∗ g, gbb∗ , bgb∗ , bb∗ b, bbb∗ } = 13/42;
(c) P {b∗ bg, bb∗ g, gb∗ b, gbb∗ , bgb∗ , b∗ gb} = 3/7.
12. P (A) = 1 implies that P (A ∪ B) = 1. Hence, by
P (A ∪ B) = P (A) + P (B) − P (AB),
we have that P (B) = P (AB). Therefore,
P (B  A) = P (B)
P (AB)
=
= P (B).
P (A)
1 Section 3.1 Conditional Probability 37 P (AB)
, where
b
P (AB) = P (A) + P (B) − P (A ∪ B) ≥ P (A) + P (B) − 1 = a + b − 1. 13. P (A  B) = 14. (a) P (AB) ≥ 0, P (B) > 0. Therefore, P (A  B) =
(b) P (S  B) = P (SB)
P (B)
=
= 1.
P (B)
P (B)
∞
i =1 P ∞ Ai B = (c) P P (AB)
≥ 0.
P (B) i =1 Ai B
= P (B) P ∞
i =1 Ai B P (B) ∞ P (Ai B)
= i =1 P (B) ∞ =
i =1 P (Ai B)
=
P (B) ∞ P (Ai  B).
i =1 Note that P (∪∞ 1 Ai B) = ∞ 1 P (Ai B), since mutual exclusiveness of Ai ’s imply that of
i=
i=
Ai B ’s; i.e., Ai Aj = ∅, i = j , implies that (Ai B)(Aj B) = ∅, i = j . 15. The given inequalities imply that P (EF ) ≥ P (GF ) and P (EF c ) ≥ P (GF c ). Thus
P (E) = P (EF ) + P (EF c ) ≥ P (GF ) + P (GF c ) = P (G). 16. Reduce the sample space: Marlon chooses from six dramas and seven comedies two at random.
What is the probability that they are both comedies? The answer is 7
2 13
= 0.269.
2 17. Reduce the sample space: There are 21 crayons of which three are red. Seven of these crayons
are selected at random and given to Marty. What is the probability that three of them are red?
18
21
The answer is
= 0.0263.
4
7 18. (a) The reduced sample space is S = {1, 3, 5, 7, 9, . . . , 9999}. There are 5000 elements in S . Since the set {5, 7, 9, 11, 13, 15, . . . , 9999} includes exactly 4998/3 = 1666 odd numbers
that are divisible by three, the reduced sample space has 1667 odd numbers that are divisible
by 3. So the answer is 1667/5000 = 0.3334.
(b) Let O be the event that the number selected at random is odd. Let F be the event that it is
divisible by 5 and T be the event that it is divisible by 3. The desired probability is calculated
as follows.
P (F c T c  O) = 1 − P (F ∪ T  O) = 1 − P (F  O) − P (T  O) + P (F T  O)
=1− 1000 1667
333
−
+
= 0.5332.
5000 5000 5000 38 Chapter 3 Conditional Probability and Independence 19. Let A be the event that during this period he has hiked in Oregon Ridge Park at least once. Let
B be the event that during this period he has hiked in this park at least twice. We have
P (B  A) =
where
P (A) = 1 −
and
P (B) = 1 − P (B)
,
P (A) 510
= 0.838
610 510 10 × 59
−
= 0.515.
610
610 So the answer is 0.515/0.838 = 0.615. 20. The numbers of 333 red and 583 blue chips are divisible by 3. Thus the reduced sample space
has 333 + 583 = 916 points. Of these numbers, [1000/15] = 66 belong to red balls and
are divisible by 5 and [1750/15] = 116 belong to blue balls and are divisible by 5. Thus the
desired probability is 182/916 = 0.199. 21. Reduce the sample space: There are two types of animals in a laboratory, 15 type I and 13
type II. Six animals are selected at random; what is the probability that at least two of them
are Type II? The answer is 1− 15
13
+
6
1
28
6 15
5 = 0.883. 22. Reduce the sample space: 30 students of which 12 are French and nine are Korean are divided
randomly into two classes of 15 each. What is the probability that one of them has exactly
four French and exactly three Korean students? The solution to this problem is
12 9 9
4
38
30 15
15 15 = 0.00241. 23. This sounds puzzling because apparently the only deduction from the name “Mary” is that one
of the children is a girl. But the crucial difference between this and Example 3.2 is reﬂected
in the implicit assumption that both girls cannot be Mary. That is, the same name cannot be
used for two children in the same family. In fact, any other identifying feature that cannot be
shared by both girls would do the trick. Section 3.2 3.2 Law of Multiplication 39 LAW OF MULTIPLICATION 1. Let G be the event that Susan is guilty. Let L be the event that Robert will lie. The probability
that Robert will commit perjury is
P (GL) = P (G)P (L  G) = (0.65)(0.25) = 0.1625. 2. The answer is 11 10
9
8
7
6
×
×
×
×
× = 0.15.
14 13 12 11 10 9 3. By the law of multiplication, the answer is
52 50 48 46 44 42
×
×
×
×
×
= 0.72.
52 51 50 49 48 47 4. (a)
(b) 5. (a)
(b) 6. 7
6
8
5
×
×
×
= 0.0144;
20 19 18 17
8
7
12
8
12
7
12
8
7
8
7
6
×
×
+
×
×
+
×
×
+
×
×
= 0.344.
20 19 18 20 19 18 20 19 18 20 19 18
5
544332211
6
×
× × × × × × × × × = 0.00216.
11 10 9 8 7 6 5 4 3 2 1
5
4
321
×
× × × = 0.00216.
11 10 9 8 7 5
5
8
5
3
8
5
3
×
×
×
+×
×
×
= 0.0712.
8 10 13 15 8 11 13 16 7. Let Ai be the event that the i th person draws the “you lose” paper. Clearly,
P (A1 ) = 1
,
200 1
199 1
·
=
,
200 199
200
199 198 1
1
P (A3 ) = P (Ac Ac A3 ) = P (Ac )P (Ac  Ac )P (A3  Ac Ac ) =
·
·
=
,
12
1
2
1
12
200 199 198
200 P (A2 ) = P (Ac A2 ) = P (Ac )P (A2  Ac ) =
1
1
1 and so on. Therefore, P (Ai ) = 1/200 for 1 ≤ i ≤ 200. This means that it makes no difference
if you draw ﬁrst, last or anywhere in the middle. Here is Marilyn Vos Savant’s intuitive solution
to this problem: 40 Chapter 3 Conditional Probability and Independence It makes no difference if you draw ﬁrst, last, or anywhere in the middle. Look at it
this way: Say the robbers make everyone draw at once. You’d agree that everyone
has the same change of losing (one in 200), right? Taking turns just makes that
same event happen in a slow and orderly fashion. Envision a rafﬂe at a church with
200 people in attendance, each person buys a ticket. Some buy a ticket when they
arrive, some during the event, and some just before the winner is drawn. It doesn’t
matter. At the party the end result is this: all 200 guests draw a slip of paper, and,
regardless of when they look at the slips, the result will be identical: one will lose.
You can’t alter your chances by looking at your slip before anyone else does, or
waiting until everyone else has looked at theirs. 8. Let B be the event that a randomly selected person from the population at large has poor credit
report. Let I be the event that the person selected at random will improve his or her credit
rating within the next three years. We have
P (B  I ) = P (BI )
P (I  B)P (B)
(0.30)(0.18)
=
=
= 0.072.
P (I )
P (I )
0.75 The desired probability is 1 − 0.072 = 0.928. Therefore, 92.8% of the people who will improve
their credit records within the next three years are the ones with good credit ratings. 9. For 1 ≤ n ≤ 39, let En be the event that none of the ﬁrst n − 1 cards is a heart or the ace
of spades. Let Fn be the event that the nth card drawn is the ace of spades. Then the event
of “no heart before the ace of spades” is 39 1 En Fn . Clearly, {En Fn , 1 ≤ n ≤ 39} forms a
n=
sequence of mutually exclusive events. Hence
39 39 n=1 39 P (En Fn ) = En Fn = P n=1
39 =
n=1 P (En )P (Fn  En )
n=1 38
n−1
52
n−1 × 1
1
=,
53 − n
14 a result which is not unexpected. 10. P (F )P (E  F ) = 13
3 39
6
52
9 × 10
= 0.059.
43 11. By the law of multiplication,
P (An ) = 234
n+1
2
× × × ··· ×
=
.
345
n+2
n+2 Section 3.3 Law of Total Probability 41 Now since A1 ⊇ A2 ⊇ A3 ⊇ · · · ⊇ An ⊇ An+1 ⊇ · · · , by Theorem 1.8,
∞ Ai = lim P (An ) = 0. P n→∞ i =1 3.3 1. LAW OF TOTAL PROBABILITY
1
1
× 0.05 + × 0.0025 = 0.02625.
2
2 2. (0.16)(0.60) + (0.20)(0.40) = 0.176.
3. 1
1
1
(0.75) + (0.68) + (0.47) = 0.633.
3
3
3 4. 1
12 13 13 39
×
+
×
=.
51 52 51 52
4 5. 11
×
50 13
2
52
2 + 12
×
50 13
1 39
1
52
2 + 13
×
50 39
2
52
2 1
=.
4 6. (0.20)(0.40) + (0.35)(0.60) = 0.290.
7. (0.37)(0.80) + (0.63)(0.65) = 0.7055.
8. 1
1
1
1
1
1
(0.6) + (0.5) + (0.7) + (0.9) + (0.7) + (0.8) = 0.7.
6
6
6
6
6
6 9. (0.50)(0.04) + (0.30)(0.02) + (0.20)(0.04) = 0.034.
10. Let B be the event that the randomly selected child from the countryside is a boy. Let E be
the event that the randomly selected child is the ﬁrst child of the family and F be the event
that he or she is the second child of the family. Clearly, P (E) = 2/3 and P (F ) = 1/3. By
the law of total probability,
P (B) = P (B  E)P (E) + P (B  F )P (F ) = 1211
1
×+×=.
2323
2 Therefore, assuming that sex distributions are equally probable, in the Chinese countryside,
the distribution of sexes will remain equal. Here is Marilyn Vos Savant’s intuitive solution to
this problem: 42 Chapter 3 Conditional Probability and Independence The distribution of sexes will remain roughly equal. That’s because–no matter how
many or how few children are born anywhere, anytime, with or without restriction–
half will be boys and half will be girls: Only the act of conception (not the government!) determines their sex.
One can demonstrate this mathematically. (In this example, we’ll assume that
women with ﬁrstborn girls will always have a second child.) Let’s say 100 women
give birth, half to boys and half to girls. The half with boys must end their families.
There are now 50 boys and 50 girls. The half with girls (50) give birth again, half
to boys and half to girls. This adds 25 boys and 25 girls, so there are now 75 boys
and 75 girls. Now all must end their families. So the result of the policy is that there
will be fewer children in number, but the boy/girl ratio will not be affected. 11. The probability that the ﬁrst person gets a gold coin is 3/5. The probability that the second
person gets a gold coin is
2332
3
×+×=.
4545
5
The probability that the third person gets a gold coin is
321322232213
3
××+××+××+××=,
543543545543
5
and so on. Therefore, they are all equal. 12. A Probabilistic Solution: Let n be the number of adults in the town. Let x be the number
of men in the town. Then n − x is the number of women in the town. Since the number of
married men and married women are equal, we have
x· 7
3
= (n − x) · .
9
5 This relation implies that x = (27/62)n. Therefore, the probability that a randomly selected
adult is male is (27/62)n n = 27/62. The probability that a randomly selected adult is female
is 1 − (27/62) = 35/62. Let A be the event that a randomly selected adult is married. Let M
be the event that the randomly selected adult is a man, and let W be the event that the randomly
selected adult is a woman. By the law of total probability,
P (A) = P (A  M)P (M) + P (A  W )P (W )
= 7 27 3 35
42
21
·
+·
=
=
≈ 0.677.
9 62 5 62
62
31 Therefore, 21/31st of the adults are married.
An Arithmetical Solution:
The common numerator of the two fractions is 21. Hence
21/27th of the men and 21/35th of the women are married. We ﬁnd the common numerator
because the number of married men and the number of married women are equal. This shows
that of every 27 + 35 = 62 adults, 21 + 21 = 42 are married. Hence 42/62th = 21/31st of the
adults in the town are married. Section 3.3 Law of Total Probability 43 13. The answer is clearly 0.40. This can also be computed from
(0.40)(0.75) + (0.40)(0.25) = 0.40. 14. Let A be the event that a randomly selected child is the k th born of his or her family. Let Bj
be the event that he or she is from a family with j children. Then
c P (A) = P (A  Bj )P (Bj ),
j =k where, clearly, P (A  Bj ) = 1/j . To ﬁnd P (Bj ), note that there are αi N families with j
children. Therefore, the total number of children in the world is c=0 i(αi N) of which j (N αj )
i
are from families with j children. Hence
P (Bj ) = j (N αj )
=
c
i =0 i(αi N ) j αj
.
c
i =0 iαi This shows that the desired fraction is given by
c P (A) = c P (A  Bj )P (Bj ) =
j =k
c =
j =k 15. Q(E  F ) = j =k αj
c
i =0 iαi = 1
·
j j αj
c
i =0 iαi c
j =k αj
.
c
i =0 iαi P (EF B)
P (B) Q(EF )
P (EF  B)
P (EF B)
=
=
= P (E  F B).
=
Q(F )
P (F  B)
P (F B)
P (F B)
P (B) 16. Let M , C , and F denote the events that the random student is married, is married to a student
at the same campus, and is female, respectively. We have that
1
2
P (F  M) = P (F  MC)P (C  M)+P (F  MC c )P (C c  M) = (0.40) +(0.30) = 0.333.
3
3 17. Let p(k, n) be the probability that exactly k of the ﬁrst n seeds planted in the farm germinated.
Using induction on n, we will show that p(k, n) = 1/(n − 1) for all k < n. For n = 2,
p(1, 2) = 1 = 1/(2 − 1) is true. If p(k, n − 1) = 1/(n − 2) for all k < n − 1, then, by the
law of total probability,
    p(k, n) = [(k − 1)/(n − 1)] p(k − 1, n − 1) + [(n − k − 1)/(n − 1)] p(k, n − 1)
            = [(k − 1)/(n − 1)] · 1/(n − 2) + [(n − k − 1)/(n − 1)] · 1/(n − 2) = 1/(n − 1).

This proves the induction hypothesis.

18. Reducing the sample space, we have that the answer is 7/10.

19. By the law of total probability, conditioning on the possible compositions of the first draw and summing the four resulting products, the answer is 0.0383.

20. We have that
    P(A | G) = P(A | GO)P(O | G) + P(A | GM)P(M | G) + P(A | GY)P(Y | G)
             = 0 × (1/3) + (1/2) × (1/3) + (3/4) × (1/3) = 5/12.

21. Let E be the event that the third number falls between the first two. Let A be the event that
the ﬁrst number is smaller than the second number. We have that
    P(E | A) = P(EA)/P(A) = (1/6)/(1/2) = 1/3.

Intuitively, the fact that P(A) = 1/2 and P(EA) = 1/6 should be clear (say, by symmetry). However, we can prove these rigorously. We show that P(A) = 1/2; P(EA) = 1/6 can be proved similarly. Let B be the event that the second number selected is smaller than the first number. Clearly A = B^c and we only need to show that P(B) = 1/2. To do this, let B_i be the event that the first number drawn is i, 1 ≤ i ≤ n. Since {B_1, B_2, . . . , B_n} is a partition of the sample space,

    P(B) = Σ_{i=1}^{n} P(B | B_i)P(B_i).

Now P(B | B_1) = 0 because if the first number selected is 1, the second number selected cannot be smaller. P(B | B_i) = (i − 1)/(n − 1), 1 ≤ i ≤ n, since if the first number is i, the second number must be one of 1, 2, 3, . . . , i − 1 if it is to be smaller. Thus

    P(B) = Σ_{i=1}^{n} P(B | B_i)P(B_i) = Σ_{i=2}^{n} [(i − 1)/(n − 1)] · (1/n) = [1/((n − 1)n)] Σ_{i=2}^{n} (i − 1)
         = [1/((n − 1)n)] [1 + 2 + 3 + · · · + (n − 1)] = [1/((n − 1)n)] · [(n − 1)n/2] = 1/2.

22. Let E_m be the event that Avril selects the best suitor given her strategy. Let B_i be the event
that the best suitor is the i th of Avril’s dates. By the law of total probability,
    P(E_m) = Σ_{i=1}^{n} P(E_m | B_i)P(B_i) = (1/n) Σ_{i=1}^{n} P(E_m | B_i).

Clearly, P(E_m | B_i) = 0 for 1 ≤ i ≤ m. For i > m, if the ith suitor is the best, then Avril chooses him if and only if, among the first i − 1 suitors Avril dates, the best is one of the first m. So

    P(E_m | B_i) = m/(i − 1).

Therefore,

    P(E_m) = (1/n) Σ_{i=m+1}^{n} m/(i − 1) = (m/n) Σ_{i=m+1}^{n} 1/(i − 1).

Now

    Σ_{i=m+1}^{n} 1/(i − 1) ≈ ∫_m^n (1/x) dx = ln(n/m).

Thus

    P(E_m) ≈ (m/n) ln(n/m).

To find the maximum of P(E_m), consider the differentiable function

    h(x) = (x/n) ln(n/x).

Since

    h′(x) = (1/n) ln(n/x) − 1/n = 0

implies that x = n/e, the maximum of P(E_m) is at m = [n/e], where [n/e] is the greatest integer less than or equal to n/e. Hence Avril should dump the first [n/e] suitors she dates and marry the first suitor she dates afterward who is better than all those preceding him. The probability that with such a strategy she selects the best suitor of all n is approximately

    h(n/e) = (1/e) ln e = 1/e ≈ 0.368.
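The optimal-stopping calculation above can be corroborated by Monte Carlo. The sketch below is ours, not the book's: it rejects the first m suitors, then accepts the first suitor better than everyone seen so far, and counts how often the overall best is chosen (the helper name and parameters are assumptions for illustration).

```python
import random

def best_choice_success(n, m, trials=20000, seed=1):
    """Estimate P(E_m): reject the first m suitors, then accept the
    first one who is better than all suitors seen before."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))        # rank 0 = the best suitor
        rng.shuffle(ranks)
        best_seen = min(ranks[:m])    # best among the rejected prefix
        # first later suitor better than everyone in the prefix (or None)
        chosen = next((r for r in ranks[m:] if r < best_seen), None)
        wins += (chosen == 0)
    return wins / trials

# With n = 100 and m = [n/e] = 36, the estimate should be near 1/e ≈ 0.368.
print(best_choice_success(100, 36))
```

Varying m confirms that success rates fall off on either side of [n/e], in line with the maximization of h(x) above.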
23. Let N be the set of nonnegative integers. The domain of f is
    {(g, r) ∈ N × N : 0 ≤ g ≤ N, 0 ≤ r ≤ N, 0 < g + r < 2N}.

Extending the domain of f to all points (g, r) ∈ R × R, we find that ∂f/∂g = ∂f/∂r = 0 gives g = r = N/2 and f(N/2, N/2) = 1/2. However, this is not the maximum value because on the boundary of the domain of f along r = 0, we find that

    f(g, 0) = (1/2)[1 + (N − g)/(2N − g)]

is maximum at g = 1, and

    f(1, 0) = (1/2)(3N − 2)/(2N − 1) ≥ 1/2.

We also find that on the boundary along r = N,

    f(g, N) = (1/2)[g/(g + N) + 1]

is maximum at g = N − 1, and

    f(N − 1, N) = (1/2)(3N − 2)/(2N − 1) ≥ 1/2.

The maximums of f along the other sides of the boundary are all less than (1/2)(3N − 2)/(2N − 1). Therefore, there are exactly two maximums and they occur at (1, 0) and (N − 1, N). That is, the maximum of f occurs if one urn contains one green and 0 red balls and the other one contains N − 1 green and N red balls. For large N, the probability that the prisoner is freed is (1/2)(3N − 2)/(2N − 1) ≈ 3/4.

3.4 BAYES' FORMULA

1. (3/4)(0.40) / [(3/4)(0.40) + (1/3)(0.60)] = 3/5.

2. 1 · (2/3) / [1 · (2/3) + (1/4)(1/3)] = 8/9.

3. Let G and I be the events that the suspect is guilty and innocent, respectively. Let A be the
event that the suspect is lefthanded. Since {G, I } is a partition of the sample space, we can
use Bayes’ formula to calculate P (G  A), the probability that the suspect has committed the
crime in view of the new evidence.
    P(G | A) = P(A | G)P(G) / [P(A | G)P(G) + P(A | I)P(I)] = (0.85)(0.65) / [(0.85)(0.65) + (0.23)(0.35)] ≈ 0.87.

4. Let G be the event that Susan is guilty. Let C be the event that Robert and Julie give conflicting
testimony. By Bayes’ formula,
    P(G | C) = P(C | G)P(G) / [P(C | G)P(G) + P(C | G^c)P(G^c)] = (0.25)(0.65) / [(0.25)(0.65) + (0.30)(0.35)] = 0.607.

5. (0.02)(0.30) / [(0.02)(0.30) + (0.05)(0.70)] = 0.1463.

6. [C(6,3)/C(11,3)](1/2) / {[C(6,3)/C(11,3)](1/2) + 1 · (1/2)} = 4/37.

7. (0.92)(1/5000) / [(0.92)(1/5000) + (1/500)(4999/5000)] = 0.084.

8. Let A be the event that two of the three coins are dimes. Let B be the event that the coin
selected from urn I is a dime. Then
    P(B | A) = P(A | B)P(B) / [P(A | B)P(B) + P(A | B^c)P(B^c)]
             = [(5/7)(3/4) + (2/7)(1/4)] / {[(5/7)(3/4) + (2/7)(1/4)] + (5/7)(1/4)(3/4)} = 68/83.

9. (0.15)(0.25) / [(0.15)(0.25) + (0.85)(0.75)] = 0.056.

10. Let R be the event that the upper side of the card selected is red. Let BB be the event that the
card with both sides black is selected. Deﬁne RR and RB similarly. By Bayes’ Formula,
    P(RB | R) = P(R | RB)P(RB) / [P(R | RB)P(RB) + P(R | RR)P(RR) + P(R | BB)P(BB)]
              = (1/2)(1/3) / [(1/2)(1/3) + 1 · (1/3) + 0 · (1/3)] = 1/3.

11. The answer is 0.21.

12. Let A be the event that the wallet originally contained a $2 bill. Let B be the event that the
bill removed is a $2 bill. The desired probability is given by
    P(A | B) = P(B | A)P(A) / [P(B | A)P(A) + P(B | A^c)P(A^c)]
             = [1 × (1/2)] / [1 × (1/2) + (1/2) × (1/2)] = 2/3.

13. By Bayes' formula, the probability that the horse that comes out is from stable I equals
    (20/33)(1/2) / [(20/33)(1/2) + (25/33)(1/2)] = 4/9.

The probability that it is from stable II is 5/9; hence the desired probability is

    (20/33)(4/9) + (25/33)(5/9) = 205/297 = 0.69.

14. [(2/4) C(5,2)C(3,2)/C(8,4)] / {0 · C(5,4)/C(8,4) + (1/4) C(5,3)C(3,1)/C(8,4) + (2/4) C(5,2)C(3,2)/C(8,4) + (3/4) C(5,1)C(3,3)/C(8,4)} = 0.571.

15. Let I be the event that the person is ill with the disease, N be the event that the result of the
test on the person is negative, and R denote the event that the person has the rash. We are
interested in P (I  R):
    P(I | R) = P(IN | R) + P(IN^c | R) = 0 + P(IN^c | R).

Since {IN, IN^c, I^cN, I^cN^c} is a partition of the sample space, by Bayes' formula,

    P(I | R) = P(IN^c | R)
             = P(R | IN^c)P(IN^c) / [P(R | IN)P(IN) + P(R | IN^c)P(IN^c) + P(R | I^cN)P(I^cN) + P(R | I^cN^c)P(I^cN^c)]
             = (0.2)(0.30 × 0.90) / [0(0.30 × 0.10) + (0.2)(0.30 × 0.90) + 0(0.70 × 0.75) + (0.2)(0.70 × 0.25)] = 0.61.

3.5 INDEPENDENCE

1. No, because by independence, regardless of the number of heads that have previously occurred, the probability of tails remains 1/2 on each flip.

2. A and B are mutually exclusive; therefore, they are dependent. If A occurs, then the probability
that B occurs is 0 and vice versa.

3. Neither. Since the probability that a fighter plane returns from a mission without mishap is 49/50 independent of other missions, the probability that a pilot who flew 49 consecutive missions without mishap makes another successful flight is still 49/50 = 0.98, neither higher nor lower than the probability of success in any other mission.

4. P(AB) = 1/12 = (1/2)(1/6); so A and B are independent.

5. (3/8)^3 (5/8)^5 = 0.00503.

6. (3/4)^2 = 0.5625.

7. (a) (0.725)^2 = 0.526; (b) (1 − 0.725)^2 = 0.076.
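Independence checks like the one in Problem 4 can be verified by brute-force enumeration of the sample space. A minimal sketch, assuming the usual coin-and-die setup with A = "heads on the coin" and B = "a 6 on the die" (the specific events are our assumption, since the exercise statement is not reproduced in this manual):

```python
from fractions import Fraction

# Sample space of (coin, die): 12 equally likely outcomes.
outcomes = [(c, d) for c in ("H", "T") for d in range(1, 7)]
p = Fraction(1, len(outcomes))

# Assumed events: A = heads on the coin, B = a 6 on the die.
A = {o for o in outcomes if o[0] == "H"}
B = {o for o in outcomes if o[1] == 6}

P = lambda E: p * len(E)
print(P(A & B), P(A) * P(B))   # P(AB) and P(A)P(B) both equal 1/12
```

Using exact `Fraction` arithmetic avoids any floating-point ambiguity when testing the equality P(AB) = P(A)P(B).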
8. Suppose that for an event A, P(A) = 3/4. Then the probability that A occurs in two consecutive independent experiments is 9/16. So the correct odds are 9 to 7, not 9 to 1. In later computations, Cardano himself realized that the correct answer is 9 to 7 and not 9 to 1.

9. We have that

    P(A beats B) = P(A rolls 4) = 4/6,
    P(B beats A) = 1 − P(A beats B) = 1 − 4/6 = 2/6,
    P(B beats C) = P(C rolls 2) = 4/6,
    P(C beats B) = 1 − P(B beats C) = 1 − 4/6 = 2/6,
    P(C beats D) = P(C rolls 6) + P(C rolls 2 and D rolls 1) = 2/6 + (4/6)(3/6) = 4/6,
    P(D beats C) = 1 − P(C beats D) = 1 − 4/6 = 2/6,
    P(D beats A) = P(D rolls 5) + P(D rolls 1 and A rolls 0) = 3/6 + (3/6)(2/6) = 4/6.
10. For 1 ≤ i ≤ 4, let A_i be the event of obtaining 6 on the ith toss. Chevalier de Méré had implicitly thought that the A_i's are mutually exclusive and so

    P(A_1 ∪ A_2 ∪ A_3 ∪ A_4) = 1/6 + 1/6 + 1/6 + 1/6 = 4 × 1/6.

Clearly the A_i's are not mutually exclusive. The correct answers are 1 − (5/6)^4 = 0.5177 and 1 − (35/36)^24 = 0.4914.

11. (1 − 0.0001)^64 = 0.9936.
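De Méré's two probabilities in Problem 10 are easy to confirm by simulation. A minimal sketch (the function name and trial counts are ours):

```python
import random

def de_mere(trials=100000, seed=42):
    """Estimate de Mere's two probabilities by simulation:
    (i) at least one 6 in 4 rolls of a die,
    (ii) at least one double 6 in 24 rolls of two dice."""
    rng = random.Random(seed)
    one_six = sum(
        any(rng.randint(1, 6) == 6 for _ in range(4)) for _ in range(trials)
    )
    double_six = sum(
        any(rng.randint(1, 6) == 6 and rng.randint(1, 6) == 6 for _ in range(24))
        for _ in range(trials)
    )
    return one_six / trials, double_six / trials

# Estimates should be close to 1 - (5/6)**4 = 0.5177 and 1 - (35/36)**24 = 0.4914.
print(de_mere())
```

The simulation also makes de Méré's error tangible: were the events mutually exclusive, the first probability would be 4/6 ≈ 0.667, well above what the runs produce.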
12. In the experiment of tossing a coin, let A be the event of obtaining heads and B be the event
of obtaining tails. 13. (a) P (A ∪ B) ≥ P (A) = 1, so P (A ∪ B) = 1. Now
1 = P (A ∪ B) = P (A) + P (B) − P (AB) = 1 + P (B) − P (AB)
gives P (B) = P (AB).
(b) If P (A) = 0, then P (AB) = 0; so P (AB) = P (A)P (B) is valid. If P (A) = 1, by
part (a), P (AB) = P (B) = P (A)P (B).
2 14. P (AA) = P (A)P (A) implies that P (A) = P (A) . This gives P (A) = 0 or P (A) = 1. 50 Chapter 3 Conditional Probability and Independence 15. P (AB) = P (A)P (B) implies that P (A) = P (A)P (B). This gives P (A) 1 − P (B) = 0;
so P (A) = 0 or P (B) = 1. 16. 1 − (0.45)6 = 0.9917.
17. 1 − (0.3)(0.2)(0.1) = 0.994.
18. There are
(100 × 10^9) × (300 × 10^9) − 1 = 30 × 10^21 − 1
other stars in the universe. Provided that Aczel’s estimate is correct, the probability of no life
in orbit around any one given star in the known universe is
0.99999999999995
independently of other stars. Therefore, the probability of no life in orbit around any other
star is
(0.99999999999995)^(30,000,000,000,000,000,000,000 − 1).
Using Aczel’s words, “this number is indistinguishable from 0 at any level of decimal accuracy
reported by the computer.” Hence the probability that there is life in orbit around at least one
other star is 1 for all practical purposes. If there were only a billion galaxies each having 10
billion stars, still the probability of life would have been indistinguishable from 1.0 at any level
of accuracy reported by the computer. In fact, if we divide the stars into mutually exclusive
groups with each group containing billions of stars, then the argument above and Exercise 8
of Section 1.7 imply that the probability of life in orbit around many other stars is a number
practically indistinguishable from 1. 19. 1 − (0.94)15 − 15(0.94)14 (0.06) = 0.226.
20. A and B are independent if and only if P (AB) = P (A)P (B), or, equivalently, if and only if
    m/(M + W) = [M/(M + W)] · [(m + w)/(M + W)].
This implies that m/M = w/W. Therefore, A and B are independent if and only if the fraction
of the men who smoke is equal to the fraction of the women who smoke. 21. (a) By Theorem 1.6,
P A(B ∪ C) = P (AB ∪ AC) = P (AB) + P (AC) − P (ABC)
= P (A)P (B) + P (A)P (C) − P (A)P (B)P (C)
= P (A) P (B) + P (C) − P (B)P (C) = P (A)P (B ∪ C).
(b) P (A − B)C = P (AB c C) = P (A)P (B c )P (C) = P (AB c )P (C) = P (A − B)P (C). 22. 1 − (5/6)6 = 0.6651. Section 3.5
n 23. (a) 1 − (n − 1)/n .
24. Independence 51 (b) As n → ∞, this approaches 1 − (1/e) = 0.6321. 1 − (0.85)10 − 10(0.85)9 (0.15)
= 0.567.
1 − (0.85)10 25. No. In the experiment of choosing a random number from (0, 1), let A, B , and C denote the
events that the point lies in (0, 1/2), (1/4, 3/4), and (1/2, 1), respectively. 26. Denote a family with two girls and one boy by ggb, with similar representations for other
cases. The sample space is S = {ggg, bbb, ggb, gbb}. we have
P {ggg } = P {bbb} = 1/8, P {ggb} = P {gbb} = 3/8. Clearly, P (A) = 6/8 = 3/4, P (B) = 4/8 = 1/2, and P (AB) = 3/8. Since P (AB) =
P (A)P (B), the events A and B are independent. Using the same method, we can show that
for families with two children and for families with four children, A and B are not independent. 27. If p is the probability of its occurrence in one trial, 1 − (1 − p)4 = 0.59. This implies that
p = 0.2. 28. (a) 1 − (1 − p1 )(1 − p2 ) · · · (1 − pn ). (1 − p1 )(1 − p2 ) · · · (1 − pn ). (b) 29. Let Ei be the event that the switch located at i is closed. The desired probability is
P (E1 E2 E4 E6 ∪ E1 E3 E5 E6 ) = P (E1 E2 E4 E6 ) + P (E1 E3 E5 E6 ) − P (E1 E2 E3 E4 E5 E6 ) = 2p4 − p6 . 30. 5
3 2
3 3 1
3 2 = 0.329. 31. For n = 3, the probabilities of the given events, respectively, are
3
2 1
2 2 1
2 1
2 2 1
1
+
2
2 3 = 1
,
2 and
3
1 + 3
2 1
2 2 3
1
=.
2
4 The probability of their joint occurrence is
3
2 1
2 2 1
3
13
= =·.
2
8
24 So the given events are independent. For n = 4, similar calculations show that the given
events are not independent. 52 Chapter 3 Conditional Probability and Independence 1
2 n
k 32. (a) 1 − (1/2)n . (b) n . (c) Let An be the event of getting n heads in the ﬁrst n ﬂips. We have
A1 ⊇ A2 ⊇ A3 ⊇ · · · ⊇ An ⊇ An+1 ⊇ · · · .
The event of getting heads in all of the ﬂips indeﬁnitely is
of probability function (Theorem 1.8), its probability is
∞ An = lim P (An ) = lim P n→∞ n=1 n→∞ ∞
n=1 1
2 An . By the continuity property n = 0. 33. Let Ai be the event that the sixth sum obtained is i , i = 2, 3, . . . , 12. Let B be the event that
the sixth sum obtained is not a repetition. By the law of total probability,
12 P (B) = P (B  Ai )P (Ai ).
i =2 Note that in this sum, the terms for i = 2 and i = 12 are equal. This is true also for the terms
for i = 3 and 11, for the terms for i = 4 and 10, for the terms for i = 5 and 9, and for the
terms for i = 6 and 8. So
6 P (B) = 2 P (B  Ai )P (Ai ) + P (B  A7 )P (A7 )
i =2 =2 35
36
+ 5 31
36 34
1
+
36
36
5 5
36 + 5 30
36 33
2
+
36
36
5 5 32
3
+
36
36 5 4
36 6
= 0.5614.
36 34. (a) Let E be the event that Dr. May’s suitcase does not reach his destination with him. We
have
P (E) = (0.04) + (0.96)(0.05) + (0.96)(0.95)(0.05) + (0.96)(0.95)(0.95)(0.04) = 0.168,
or simply, P (E) = 1 − (0.96)(0.95)(0.96) = 0.168.
(b) Let D be the event that the suitcase is lost in Da Vinci airport in Rome. Then, by Bayes’
formula,
P (D)
(0.96)(0.05)
P (D  E) =
=
= 0.286.
P (E)
0.168 35. Let E be the event of obtaining heads on the coin before an ace from the cards. Let H , T , A,
and N denote the events of heads, tails, ace, and not ace in the ﬁrst experiment, respectively.
We use two different techniques to solve this problem. Section 3.5 Independence 53 Technique 1: By the law of total probability,
P (E) = P (E  H )P (H ) + P (E  T )P (T ) = 1 · 1
1
+ P (E  T ) · ,
2
2 where
P (E  T ) = P (E  T A)P (A  T ) + P (E  T N )P (N  T ) = 0 ·
Thus
P (E) = 1
12
+ P (E) · .
13
13 12 1
1
+ P (E)
,
2
13 2 which gives P (E) = 13/14.
Technique 2: We have that
P (E) = P (E  H A)P (H A)+P (E  T A)P (T A)+P (E  H N )P (H N )+P (E  T N )P (T N).
Thus 1
1
1
1
1 12
1 12
×
+0× ×
+1× ×
+ P (E) × × .
2 13
2 13
2 13
2 13
This gives P (E) = 13/14.
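The fixed-point answer P(E) = 13/14 can be corroborated by simulating rounds directly: each round flips a fair coin (heads wins) and, on tails, draws a card that is an ace with probability 1/13 (an ace loses); otherwise the round repeats. This is a sketch under that reading of the experiment:

```python
import random

def heads_before_ace(trials=100000, seed=7):
    """Fraction of games in which heads occurs before an ace is drawn."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        while True:
            if rng.random() < 0.5:        # heads: E occurs
                wins += 1
                break
            if rng.random() < 1 / 13:     # tails followed by an ace: E fails
                break
    return wins / trials

# Should be close to 13/14 ≈ 0.9286.
print(heads_before_ace())
```

Because each repetition is independent of the past, the simulated game hits exactly the recursion P(E) = 1/2 + (1/2)(12/13)P(E) solved above.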
36. Let P(A) = p and P(B) = q. Let A_n be the event that neither A nor B occurs in the first
n − 1 trials and the outcome of the nth experiment is A. The desired probability is
    P(∪_{n=1}^{∞} A_n) = Σ_{n=1}^{∞} P(A_n) = Σ_{n=1}^{∞} (1 − p − q)^{n−1} p = p/[1 − (1 − p − q)] = p/(p + q).

37. The probability of sum 5 is 1/9 and the probability of sum 7 is 1/6. Therefore, by the result of Exercise 36, the desired probability is

    (1/9) / (1/6 + 1/9) = 2/5.

38. Let A be the event that one of them is red and the other one is blue. Let RB represent the
event that the ball drawn from urn I is red and the ball drawn from urn II is blue, with similar
representations for RR , BB , and BR . We have that
    P(A) = P(A | RB)P(RB) + P(A | RR)P(RR) + P(A | BB)P(BB) + P(A | BR)P(BR)
         = [C(9,1)C(5,1)/C(14,2)] (9/10)(5/6) + [C(8,1)C(6,1)/C(14,2)] (9/10)(1/6)
           + [C(10,1)C(4,1)/C(14,2)] (1/10)(5/6) + [C(9,1)C(5,1)/C(14,2)] (1/10)(1/6) = 0.495.

39. For convenience, let p_0 = 0; the desired probability is
    1 − Π_{i=1}^{n} (1 − p_i) − Σ_{i=1}^{n} (1 − p_1)(1 − p_2) · · · (1 − p_{i−1}) p_i (1 − p_{i+1}) · · · (1 − p_n).

40. Let p be the probability that a randomly selected person was born on one of the first 365 days;
then 365p + (p/4) = 1 implies that p = 4/1461. Let E be the event that exactly four people
of this group have the same birthday and that all the others have different birthdays. E is the
union of the following three mutually exclusive events:
F : Exactly four people of this group have the same birthday, all the others have different
birthdays, and none of the birthdays is on the 366th day.
G: Exactly four people of this group have the same birthday, all the others have different
birthdays, and exactly one has his/her birthday on the 366th day.
H : Exactly four people of this group have their birthday on the 366th day and all the others
have different birthdays.
We have that
P (E) = P (F ) + P (G) + P (H )
365
1 = 30
4 4
1461 4 · + 30
1 1
365
·
1461
1 + 30
4 1
1461 4 · 364
4
26!
26
1461
29
4 4
1461 365
4
26!
26
1461 4 26 · 26 364
4
25!
25
1461 25 = 0.00020997237. If we were allowed to ignore the effect of the leap year, the solution would have been as
follows.
365 30
1 4 364
1 26
·
26!
= 0.00021029.
1
1
26
365
365 41. Let Ei be the event that the switch located at i is closed. We want to calculate the probability of
E2E4 ∪ E1E5 ∪ E2E3E5 ∪ E1E3E4. Using the rule to calculate the probability of the union of several events (the inclusion-exclusion principle), we get that the answer is 2p^2 + 2p^3 − 5p^4 + p^5.
the corresponding events for B and C , respectively. Clearly,
P (ABC) = P (ABC  EF G)P (EF G) + P (ABC  E c F G)P (E c F G)
+ P (ABC  E c F c )P (E c F c ). (5) Now
P (ABC  EF G) = P (ABC), (6) Section 3.5 Independence 55
and
P (ABC  E c F c ) = 1. (7) To calculate P (ABC  E c F G), note that since A has already lost, the game continues between
B and C . Let BC be the event that B loses and C wins. Then
P (ABC  E c F G) = P (BC). (8) Let F2 be the event that B answers the second question correctly; then
C
C
P (BC) = P (BC  F2 )P (F2 ) + P (BC  F2 )P (F2 ). (9) To ﬁnd P (BC  F2 ), note that this quantity is the probability that B loses to C given that B
did not lose the ﬁrst play. So, by independence, this is the probability that B loses to C given
that C plays ﬁrst. Now by symmetry, this quantity is the same as C losing to B if B plays ﬁrst.
Thus it is equal to P (CB), and hence (9) gives
P (BC) = P (CB) · p + 1 · (1 − p);
noting that P (CB) = 1 − P (BC), this gives
P (BC) = 1
.
1+p Therefore, by (8),
P (ABC  E c F G) = 1
.
1+p substituting this, (8), and (7) in (5), yields
P (ABC) = P (ABC) · p 3 + 1
(1 − p)p 2 + (1 − p)2 .
1+p Solving this for P (ABC), we obtain
P (ABC) = 1
.
(1 + p)(1 + p + p 2 ) Now we ﬁnd P (BCA) and P (CAB).
P (BCA) = P (BCA  E)P (E) + P (BCA  E c )P (E c )
p
= P (ABC) · p + 0 · (1 − p) =
,
(1 + p)(1 + p + p 2 )
P (CAB) = P (CAB  E)P (E) + P (CAB  E c )P (E c )
= P (BCA) · p + 0 · (1 − p) = p2
.
(1 + p)(1 + p + p 2 ) 56 Chapter 3 Conditional Probability and Independence 43. We have that
11
3
1
· +0· = .
24
4
8 P (H1 ) = P (H1  H )P (H ) + P (H1  H c )P (H c ) = cc
Similarly, P (H2 ) = 1/8. To calculate P (H1 H2 ), the probability that none of her sons is
hemophiliac, we condition on H again.
cc
cc
cc
P (H1 H2 ) = P (H1 H2  H )P (H ) + P (H1 H2  H c )P (H c ).
cc
cc
Clearly, P (H1 H2  H c ) = 1. To ﬁnd P (H1 H2  H ), we use the fact that H1 and H2 are
conditionally independent given H .
cc
c
c
P (H1 H2  H ) = P (H1  H )P (H2  H ) = Thus
cc
P (H1 H2 ) = 1
11
·=.
22
4 3
13
11
· +1· = .
44
4
16 44. The only quantity not calculated in the hint is P (Ui  Rm ). By Bayes’ Formula,
P (Ui  Rm ) = P (Rm  Ui )P (Ui )
n = i
n
n P (Rm  Uk )P (Uk )
k =0 3.6 k =0 m k
n 1
n+1
m 1
n+1 = i
n
n
k =0 m .
k
n m APPLICATIONS OF PROBABILITY TO GENETICS 1. Clearly, Kim and Dan both have genotype OO . With a genotype other than AO for John, it is
impossible for Dan to have blood type O . Therefore, the probability is 1 that John’s genotype
is AO . 2. The answer is k
k(k + 1)
+k =
.
2
2 3. The genotype of the parent with wrinkled shape is necessarily rr . The genotype of the other
parent is either Rr or RR . But, RR will never produce wrinkled offspring. So it must be Rr .
Therefore, the parents are rr and Rr . 4. Let A represent the dominant allele for free earlobes and a represent the recessive allele for
attached earlobes. Let B represent the dominant allele for freckles and b represent the recessive
allele for no freckles. Since Dan has attached earlobes and no freckles, Kim and John both
must be AaBb. This implies that Kim and John’s next child is AA with probability 1/4, Aa Section 3.6 Applications of Probability to Genetics 57 with probability 1/2, and aa with probability 1/4. Therefore, the next child has free earlobes
with probability 3/4. Similarly, the next child is BB with probability 1/4, Bb with probability
1/2, and bb with probability 1/4. Hence he or she will have no freckles with probability 1/4.
By independence, the desired probability is (3/4)(1/4) = 3/16. 5. If the genes are not linked, 25% of the offspring are expected to be BbV v , 25% are expected
to be bbvv , 25% are expected to be Bbvv , and 25% are expected to be bbV v . The observed
data shows that the genes are linked. 6. Clearly, John’s genotype is either Dd or dd . Let E be the event that it is dd . Then E c is the
event that John’s genotype is Dd . Let F be the event that Dan is deaf. That is, his genotype
is dd . We use Bayes’ theorem to calculate the desired probability.
P (E  F ) = P (F  E)P (E)
P (F  E)P (E) + P (F  E c )P (E c ) = 1 · (0.01)
= 0.0198.
1 · (0.01) + (1/2)(0.99) Therefore, the probability is 0.0198 that John is also deaf. 7. A person who has cystic ﬁbrosis carries two mutant alleles. Applying the HardyWeinberg
law, we have that q 2 = 0.0529, or q = 0.23. Therefore, p = 0.77. Since q 2 + 2pq =
1 − p2 = 0.4071, the percentage of the people who carry at least one mutant allele of the
disease is 40.71%. 8. Dan inherits all of his sexlinked genes from his mother. Therefore, John being normal has no
effect on whether or not Dan has hemophilia or not. Let E be the event that Kim is H h. Then
E c is the event that Kim is H H . Let F be the event that Dan has hemophilia. By the law of
total probability,
P (F ) = P (F  E)P (E) + P (F  E c )P (E c )
= (1/2) 2(0.98)(0.02) + 0 · (0.98)(0.98) = 0.0196. 9. Dan has inherited all of his sexlinked genes from his mother. Let E1 be the event that Kim is
CC , E2 be the event that she is Cc, and E3 be the event that she is cc. Let F be the event that
Dan is colorblind. By Bayes’ formula, the desired probability is
P (E3  F ) =
= P (F  E3 )P (E3 )
P (F  E1 )P (E1 ) + P (F  E2 )P (E2 ) + P (F  E3 )P (E3 )
1 · (0.17)(0.17)
= 0.17.
0 · (0.83)(0.83) + (1/2) 2(0.83)(0.17) + 1 · (0.17)(0.17) 10. Since Ann is hh and John is hemophiliac, Kim is either H h or hh. Let E be the event that she
is H h. Then E c is the event that she is hh. Let F be the event that Ann has hemophilia. By 58 Chapter 3 Conditional Probability and Independence Bayes’ formula, the desired probability is
P (E  F ) =
= P (F  E)P (E)
P (F  E)P (E) + P (F  E c )P (E c )
(1/2) 2(0.98)(0.02)
(1/2) 2(0.98)(0.02) + 1 · (0.02)(0.02) = 0.98. 11. Clearly, both parents of Mr. J must be Cc. Since Mr. J has survived to adulthood, he is not cc.
Therefore, he is either CC or Cc. We have
P (he is CC  he is CC or Cc) = P (he is CC)
1/4
1
=
=.
P (he is CC or Cc)
3/4
3 P (he is Cc  he is CC or Cc) = 2
.
3 Mr. J’s wife is either CC with probability 1 − p or Cc with probability p . Let E be the event
that Mr. J is Cc, F be the event that his wife is Cc, and H be the event that their next child is
cc. The desired probability is
P (H ) = P (H EF ) = P (H  EF )P (EF )
= P (H  EF )P (E)P (F ) = 12
p
· ·p = .
43
6 12. Let E1 be the event that both parents are of genotype AA, let E2 be the event that one parent
is of genotype Aa and the other of genotype AA, and let E3 be the event that both parents are
of genotype Aa . Let F be the event that the man is of genotype AA. By Bayes’ formula,
P (E1  F ) = P (F  E1 )P (E1 )
P (F  E1 )P (E1 ) + P (F  E2 )P (E2 ) + P (F  E3 )P (E3 ) = p2
1 · p4
=
= p2 .
1 · p 4 + (1/2) · 4p 3 q + (1/4) · 4p 2 q 2
(p + q)2 Similarly, P (E2  F ) = 2pq and P (E3  F ) = q 2 . Let B be the event that the brother is AA.
We have
P (B  F ) = P (B  F E1 )P (E1  F ) + P (B  F E2 )P (E2  F ) + P (B  F E3 )P (E3  F )
= P (B  E1 )P (E1  F ) + P (B  E2 )P (E2  F ) + P (B  E3 )P (E3  F )
= 1 · p2 + 1
(1 + p)2
1
(2p + q)2
· 2pq + · q 2 =
=
.
2
4
4
4 Chapter 3 Review Problems 59 REVIEW PROBLEMS FOR CHAPTER 3
1. 12 13 13 12
26
·
+
·
=
= 0.347.
30 30 30 30
75 2. 1 − (0.97)6 = 0.167.
3. (0.48)(0.30) + (0.67)(0.53) + (0.89)(0.17) = 0.65.
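Several of these review answers are single applications of the law of total probability, P(A) = Σ P(A | B_i)P(B_i). A small helper (ours, for checking such weighted averages; the numbers below are taken from the solution line above):

```python
def total_probability(cond_probs, partition_probs):
    """Law of total probability: P(A) = sum of P(A|B_i) * P(B_i)."""
    # The B_i must form a partition, so their probabilities sum to 1.
    assert abs(sum(partition_probs) - 1) < 1e-9
    return sum(a * b for a, b in zip(cond_probs, partition_probs))

# Review problem 3: conditional probabilities 0.48, 0.67, 0.89 over a
# partition with probabilities 0.30, 0.53, 0.17.
print(round(total_probability([0.48, 0.67, 0.89], [0.30, 0.53, 0.17]), 2))
```

The partition check catches the most common slip in these problems: weights that do not sum to 1 because a case was dropped.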
4. (0.5)(0.05) + (0.7)(0.02) + (0.8)(0.035) = 0.067.
5. (a) (0.95)(0.97)(0.85) = 0.783; (b) 1 − (0.05)(0.03)(0.05) = 0.999775;
(c) 1 − (0.95)(0.97)(0.85) = 0.217; (d) (0.05)(0.03)(0.15) = 0.000225. 6. 103/132 = 0.780.
7. (0.08)(0.20)
= 0.0796.
(0.2)(0.3) + (0.25)(0.5) + (0.08)(0.20) 8. 1 − 26
6 39
6 = 0.929. 9. 1/6.
1− 10. 5
6 10 − 10 5
1−
6 5
6 9 1
6 = 0.615. 10 24
·
8
77
= 0.35.
11.
=
23
24 53
·+·
77 77 12. Let A be the event of “head on the coin.” Let B be the event of “tail on the coin and 1 or 2 on
the die.” Then A and B are mutually exclusive, and by the result of Exercise 36 of Section 3.5,
1/2
3
the answer is
=.
(1/2) + (1/6)
4 13. The probability that the number of 1’s minus the number of 2’s will be 3 is
P (four 1’s and one 2) + P (three 1’s and no 2’s)
= 6
4 1
6 4 2
1 1
6 4
6
+
6
3 1
6 3 4
6 3 = 0.03. 60 Chapter 3 Conditional Probability and Independence 14. The probability that the ﬁrst urn was selected in the ﬁrst place is
20
·
45
20 1
·+
45 2 1
10
2
.
=
19
10 1
·
25 2 The desired probability is
20 10 10 9
·
+
·
≈ 0.42.
45 19 25 19 15. Let B be the event that the ball removed from the third urn is blue. Let BR be the event that
the ball drawn from the ﬁrst urn is blue and the ball drawn from the second urn is red. Deﬁne
BB , RB , and RR similarly. We have that
P (B) = P (B  BB)P (BB) + P (B  RB)P (RB) + P (B  RR)P (RR) + P (B  BR)P (BR)
4 15
5 95
6 91
5 11
38
=
·
+
·
+
·
+
·
=
= 0.36.
14 10 6 14 10 6 14 10 6 14 10 6
105 16. Let E be the event that Lorna guesses correctly. Let R be the event that a red hat is placed
on Lorna’s head, and B be the event that a blue hat is placed on her head. By the law of total
probability,
P (E) = P (E  R)P (R) + P (E  B)P (B)
1
1
1
= α · + (1 − α) · =
2
2
2
This shows that Lorna’s chances are 50% to guess correctly no matter what the value of α is.
This should be intuitively clear. 17. Let F be the event that the child is found; E be the event that he is lost in the east wing, and
W be the event that he is lost in the west wing. We have
P (F ) = P (F  E)P (E) + P (F  W )P (W )
= 1 − (0.6)3 (0.75) + 1 − (0.6)2 (0.25) = 0.748. 18. The answer is that it is the same either way. Let W be the event that they win one of the nights
to themselves. Let F be the event that they win Friday night to themselves. Then
P (W ) = P (W  F )P (F ) + P (W  F c )P (F c ) = 1 · 2
1 12
+·=.
3 23
3 19. Let A be the event that Kevin is prepared. We have that
P (R  B c S c ) =
= P (RB c S c  A)P (A) + P (RB c S c  Ac )P (Ac )
P (RB c S c )
=
P (B c S c )
P (B c S c  A)P (A) + P (B c S c  Ac )P (Ac )
(0.85)(0.15)2 (0.85) + (0.20)(0.80)2 (0.15)
= 0.308.
(0.15)2 (0.85) + (0.80)2 (0.15) Chapter 3 Review Problems 61 Note that
P (R) = P (R  A)P (A) + P (R  Ac )P (Ac ) = (0.85)(0.85) + (0.20)(0.15) = 0.7525.
Since P (R  B c S c ) = P (R), the events R , B , and S are not independent. However, it must be
clear that R , B , and S are conditionally independent given that Kevin is prepared and they are
conditionally independent given that Kevin is unprepared. To explain this, suppose that we are
given that, for example, Smith and Brown both failed a student. This information will increase
the probability that the student was unprepared. Therefore, it increases the probability that
Rose will also fails the student. However, if we know that the student was unprepared, the
knowledge that Smith and Brown failed the student does not affect the probability that Rose
will also fail the student. 20. (a) Let A be the event that Adam has at least one king; B be the event that he has at least
two kings. We have
P (B  A) = P (Adam has at least two kings)
P (AB)
=
P (A)
P (Adam has at least one king) 1−
= 48
13
52
13 − 1− 48 4
12 1
52
13
48
13
52
13 = 0.3696. (b) Let A be the event that Adam has the king of diamonds. Let B be the event that he has
the king of diamonds and at least one other king. Then
48
11
P (B  A) = P (BA)
=
P (A) 3
48 3
48
+
+
1
10 2
9
52
13
51
12
52
13 3
3
= 0.5612. Knowing that Adam has the king of diamonds reduces the sample space to a size considerably
smaller than the case in which we are given that he has a king. This is why the answer to 62 Chapter 3 Conditional Probability and Independence part (b) is larger than the answer to part (a). If one is not convinced of this, he or she should
solve the problem in a simpler case. For example, a case in which there are four cards, say,
king of diamonds, king of hearts, jack of clubs, and eight of spade. If two cards are drawn,
the reduced sample space in the case Adam announces that he has a king is
{Kd Kh , Kd Jc , Kd 8s , Kh Jc , Kh 8s },
while the reduced sample space in the case Adam announces that he has the king of diamonds
is
{Kd Kh , Kd Jc , Kd 8s }.
In the ﬁrst case, the probability of more kings is 1/5; in the second case the probability of
more kings is 1/3. Chapter 4 D istribution F unctions and
Discrete R andom Variables
4.2 DISTRIBUTION FUNCTIONS 1. The set of possible values of X is {0, 1, 2, 3, 4, 5}. The probabilities associated with these
values are
x
P (X = x) 0
6/36 1
10/36 2
8/36 3
6/36 4
4/36 5
2/36 2. The set of possible values of X is {−6, −2, −1, 2, 3, 4}. The probabilities associated with
these values are P (X = −6) = P (X = 2) = P (X = 4) = P (X = −2) = P (X = −1) = P (X = 3) = 5
2
15
2 = 0.095, 5
1 5
1
15
2 = 0.238. 3. The set of possible values of X is {0, 1, 2 . . . , N }. Assuming that people have the disease
independent of each other,
P (X = i) = (1 − p)i −1 p 1≤i≤N (1 − p)N i = 0. 4. Let X be the length of the side of a randomly chosen plastic die manufactured by the factory,
then
P (X 3 > 1.424) = P (X > 1.125) = 1.25 − 1.125
1
=.
1.25 − 1
2 64 Chapter 4 Distribution Functions and Discrete Random Variables 5. P (X < 1) = F (1−) = 1/2.
P (X = 1) = F (1) − F (1−) = 1/6.
P (1 ≤ X < 2) = F (2−) − F (1−) = 1/4.
P (X > 1/2) = 1 − F (1/2) = 1 − 1/2 = 1/2.
P (X = 3/2) = 0.
P (1 < X ≤ 6) = F (6) − F (1) = 1 − 2/3 = 1/3. 6. Let F be the distribution function of X. Then ⎧
⎪0
⎪
⎪
⎪
⎪
⎪1/8
⎪
⎪
⎨
F (t) = 1/2
⎪
⎪
⎪
⎪7/8
⎪
⎪
⎪
⎪
⎩
1 t <0
0≤t <1
1≤t <2
2≤t <3
t ≥ 3. 7. Note that X is neither continuous nor discrete. The answers are
(a) F (6−) = 1 implies that k(−36 + 72 − 3) = 1; so k = 1/33.
(b) F (4) − F (2) = 29/33 − 4/33 = 25/33.
(c) 1 − F (3) = 1 − (24/33) = 9/33.
(d) 9
29
−
F (4) − F (3−)
5
33 33
=
P (X ≤ 4  X ≥ 3) =
=.
1 − F (3−)
6
9
1−
33 8. F (Q0.5 ) = 1/2 implies that 1 + e−x = 2. The only solution of this question is x = 0. So x = 0 is the median of F . Similarly, F (Q0.25 ) = 1/4 implies that 1 + e−x = 4, the solution
of which is x = − ln 3. F (Q0.75 ) = 3/4 implies that 1 + e−x = 4/3, the solution of which is
x = ln 3. So − ln 3 and ln 3 are the ﬁrst and the third quartiles of F , respectively. Therefore,
50% of the years the rate at which the price of oil per gallon changes is negative or zero, 25%
of the years the rate is − ln 3 ≈ −1.0986 or less, and 75% of the years the rate is ln 3 ≈ 1.0986
or less. 9. (a)
P (X  ≤ t) = P (−t ≤ X ≤ t) = P (X ≤ t) − P (X < −t)
= F (t) − 1 − P (X ≥ −t) = F (t) − 1 − P (x ≤ t) = 2F (t) − 1.
(b) Using part (a), we have
P (X  > t) = 1 − P (X ≤ t) = 1 − 2F (t) − 1 = 2 1 − F (t) . Section 4.2 Distribution Functions 65 (c)
P (X = t) = 1 + P (X = t) − 1 = P (X ≤ t) + P (X > t) + P (X = t) − 1
= P (X ≤ t) + P (X ≥ t) − 1 = P (X ≤ t) + P (X ≤ −t) − 1
= F (t) + F (−t) − 1. 10. F is a distribution function because F (−∞) = 0, F (∞) = 1, F is right continuous, and
F (t) = 1 −t
e > 0 implies that F is nondecreasing.
π 11. F is a distribution function because F (−∞) = 0, F (∞) = 1, F is right continuous, and
F (t) = 1
> 0 implies that it is nondecreasing.
(1 + t)2 12. Clearly, F is right continuous. On t < 0 and on t ≥ 0, it is increasing, limt →∞ F (t) = 1, and limt →−∞ F (t) = 0. It looks like F satisﬁes all of the conditions necessary to make
it a distribution function. However, F (0−) = 1/2 > F (0+) = 1/4 shows that F is not
nondecreasing. Therefore, F is not a probability distribution function. 13. Let the departure time of the last ﬂight before the passenger arrives be 0. Then Y , the arrival
time of the passenger is a random number from (0, 45). The waiting time is X = 45 − Y . We
have that for 0 ≤ t ≤ 45,
P(X ≤ t) = P(45 − Y ≤ t) = P(Y ≥ 45 − t) = [45 − (45 − t)]/45 = t/45.

So F, the distribution function of X, is

F(t) = 0 if t < 0; t/45 if 0 ≤ t < 45; 1 if t ≥ 45.

14. Let X be the first two-digit number selected from the set {00, 01, 02, . . . , 99} which is between 4 and 18. Since for i = 4, 5, . . . , 18,

P(X = i | 4 ≤ X ≤ 18) = P(X = i)/P(4 ≤ X ≤ 18) = (1/100)/(15/100) = 1/15,

we have that X is chosen randomly from the set {4, 5, . . . , 18}.

15. Let X be the minimum of the three numbers. Then

P(X < 5) = 1 − P(X ≥ 5) = 1 − C(36, 3)/C(40, 3) = 0.277,

where C(n, k) denotes the binomial coefficient "n choose k."

16.
P(X² − 5X + 6 > 0) = P((X − 2)(X − 3) > 0) = P(X < 2) + P(X > 3) = (2 − 0)/(3 − 0) + 0 = 2/3.

17.

F(t) = 0 if t < 0; t/(1 − t) if 0 ≤ t < 1/2; 1 if t ≥ 1/2.

18. The distribution function of X is F(t) = 0 if t < 1; F(t) = 1 − (89/90)^n if n ≤ t < n + 1, n ≥ 1. Since

F(26−) = 1 − (89/90)^25 = 0.244 < 0.25 < 1 − (89/90)^26 = 0.252 = F(26),

26 is the first quartile. Since

F(63−) = 1 − (89/90)^62 = 0.4998 < 0.5 < 1 − (89/90)^63 = 0.505 = F(63),

63 is the median of X. Similarly,

F(125−) = 1 − (89/90)^124 = 0.7498 < 0.75 < 1 − (89/90)^125 = 0.753 = F(125)

implies that 125 is the third quartile of X.
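The three quartile values can be double-checked by searching for the smallest n at which F crosses each probability level; a small sketch:

```python
# Quartiles of F(t) = 1 - (89/90)^n for n <= t < n+1:
# the p-quantile is the smallest integer n with F(n) >= p.
def smallest_n(p):
    n = 1
    while 1 - (89 / 90) ** n < p:
        n += 1
    return n

print(smallest_n(0.25), smallest_n(0.50), smallest_n(0.75))  # -> 26 63 125
```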
19.

G(t) = F(t) if t < 5; 1 if t ≥ 5.

4.3 DISCRETE RANDOM VARIABLES

1. F, the distribution function of X, is given by

F(x) = 0 if x < 1; 1/15 if 1 ≤ x < 2; 3/15 if 2 ≤ x < 3; 6/15 if 3 ≤ x < 4; 10/15 if 4 ≤ x < 5; 1 if x ≥ 5.

2. p, the probability mass function of X, is given by

x      1      2     3     4     5     6
p(x)   11/36  9/36  7/36  5/36  3/36  1/36

F, the probability distribution function of X, is given by

F(x) = 0 if x < 1; 11/36 if 1 ≤ x < 2; 20/36 if 2 ≤ x < 3; 27/36 if 3 ≤ x < 4; 32/36 if 4 ≤ x < 5; 35/36 if 5 ≤ x < 6; 1 if x ≥ 6.

3. The possible values of X are 2, 3, . . . , 12. The sample space of this experiment consists of 36
equally likely outcomes. Hence the probability of any of them is 1/36. Thus
p(2) = P(X = 2) = P({(1, 1)}) = 1/36,
p(3) = P(X = 3) = P({(1, 2), (2, 1)}) = 2/36,
p(4) = P(X = 4) = P({(1, 3), (2, 2), (3, 1)}) = 3/36.

Similarly,

i      5     6     7     8     9     10    11    12
p(i)   4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

4. Let p be the probability mass function of X. We have

x      −2    2     4      6
p(x)   1/2   1/10  13/45  1/9

5. Let p be the probability mass function of X and q be the probability mass function of Y. We have

p(i) = (9/10)^(i−1) (1/10), i = 1, 2, . . . ,

q(j) = P(Y = j) = P(X = (j − 1)/2) = (9/10)^((j−3)/2) (1/10), j = 3, 5, 7, . . . .

6. Mode of p = 1; mode of q = 3.

7. (a) k(−1)² + k(1)² + k(2)² + k(3)² = k + k + 4k + 9k = 1 ⇒ k = 1/15.

(b) k(1 + 2 + 3 + 4 + 5) = 1 ⇒ 15k = 1 ⇒ k = 1/15.

(c) k[(1/9) + (1/9)² + · · · ] = k · (1/9)/[1 − (1/9)] = k/8 = 1 ⇒ k = 8.

(d) k(1 + 2 + · · · + n) = 1 ⇒ k · n(n + 1)/2 = 1 ⇒ k = 2/[n(n + 1)].

(e) k(1² + 2² + · · · + n²) = 1 ⇒ k · n(n + 1)(2n + 1)/6 = 1 ⇒ k = 6/[n(n + 1)(2n + 1)].

8. Let p be the probability mass function of X; then

p(i) = P(X = i) = C(18, i) C(28, 12 − i)/C(46, 12), i = 0, 1, 2, . . . , 12.

9. For x < 0, F(x) = 0. If x ≥ 0, then n ≤ x < n + 1 for some nonnegative integer n, and we have that

F(x) = (3/4)[1 + (1/4) + (1/4)² + · · · + (1/4)^n] = (3/4) · [1 − (1/4)^(n+1)]/[1 − (1/4)] = 1 − (1/4)^(n+1).

Thus

F(x) = 0 if x < 0; 1 − (1/4)^(n+1) if n ≤ x < n + 1, n = 0, 1, 2, . . . .

10. Let p be the probability mass function of X and F be its distribution function. We have
p(i) = (5/6)^(i−1) (1/6), i = 1, 2, 3, . . . .

F(x) = 0 for x < 1. If x ≥ 1, then n ≤ x < n + 1 for some positive integer n, and we have that

F(x) = (1/6)[1 + (5/6) + (5/6)² + · · · + (5/6)^(n−1)] = (1/6) · [1 − (5/6)^n]/[1 − (5/6)] = 1 − (5/6)^n.

Hence

F(x) = 0 if x < 1; 1 − (5/6)^n if n ≤ x < n + 1, n = 1, 2, 3, . . . .

11. The set of possible values of X is {2, 3, 4, . . . }. For n ≥ 2, X = n if and only if either all of
P (X = n) = 1
2 n−1 · 1
1
+
2
2 n−1 · 1
1
=
2
2 n−1 n ≥ 2. , 12. The event Z > i occurs if and only if Liz has not played with Bob since i Sundays ago, and
the earliest she will play with him is next Sunday. Now the probability is i/k that Liz will
play with Bob if last time they played was i Sundays ago; hence
i
P (Z > i) = 1 − ,
k i = 1, 2, . . . , k − 1. Let p be the probability mass function of Z . Then, using this fact for 1 ≤ i ≤ k , we obtain
p(i) = P (Z = i) = P (Z > i − 1) − P (Z > i) = 1 − i
1
i−1
− 1−
=.
k
k
k 13. The possible values of X are 0, 1, 2, 3, 4, and 5. For i , 0 ≤ i ≤ 5, P (X = i) = 5
i 6 Pi · 9 P5−i · 10!
. 15! The numerical values of these probabilities are as follows.
i
P (X = i) 0
42/1001 1
252/1001 2
420/1001 3
240/1001 4
45/1001 14. For i = 0, 1, 2, and 3, we have P (X = i) = 10
i 10 − i 6−2i
2
6 − 2i
.
20
6 The numerical values of these probabilities are as follows.
i
p(i) 0
112/323 1
168/323 2
42/323 3
1/323 5
2/1001 70 Chapter 4 Distribution Functions and Discrete Random Variables 15. Clearly,
6 P (X > n) = P Ei ·
i =1 To calculate P E1 ∪ E2 ∪ · · · ∪ E6 , we use the inclusionexclusion principle. To do so, we
must calculate the probabilities of all possible intersections of the events from E1 , . . . , E6 ,
add the probabilities that are obtained by intersecting an odd number of events, and subtract
all the probabilities that are obtained by intersecting an even number of events. Clearly, there
6
6
6
are
terms of the form P (Ei ),
terms of the form P (Ei Ej ),
terms of the form
1
2
3
P (Ei Ej Ek ), and so on. Now for all i , P (Ei ) = (5/6)n ; for all i and j , P (Ei Ej ) = (4/6)n ;
for all i , j , and k , P (Ei Ej Ek ) = (3/6)n ; and so on. Thus
P (X > n) = P (E1 ∪ E2 ∪ · · · ∪ E6 )
= 6
1 =6 5
6 5
6
n n 6
2 − − 15 4
6 4
6
n n + 20 +
3
6 6
3
n 3
6 n − 2
6 − 15 n 6
4 2
6 n +6 1
6 6
5 n + 1
6 n . Let p be the probability mass function of X. The set of all possible values of X is {6, 7, 8, . . . },
and
p(n) = P (X = n) = P (X > n − 1) − P (X > n)
= 5
6 n−1 −5 4
6 n−1 + 10 3
6 n−1 − 10 2
6 n−1 +5 1
6 n−1 , n ≥ 6. 16. Put the students in some random order. Suppose that the ﬁrst two students form the ﬁrst team,
the third and fourth students form the second team, the ﬁfth and sixth students form the third
team, and so on. Let F stand for “female” and M stand for “male.” Since our only concern
is gender of the students, the total number of ways we can form 13 teams, each consisting of
two students, is equal to the number of distinguishable permutations of a sequence of 23 M ’s
26!
26
and three F ’s. By Theorem 2.4, this number is
=
. The set of possible values of
23! 3!
3
the random variable X is {2, 4, . . . , 26}. To calculate the probabilities associated with these
values, note that for k = 1, 2, . . . , 13, X = 2k if and only if one of the following events
occurs:
A: One of the ﬁrst k − 1 teams is a femalefemale team, the k th team is either a malefemale
or a femalemale team, and the remaining teams are all malemale teams. B: The ﬁrst k − 1 teams are all malemale teams, and the k th team is either a malefemale
team or a femalemale team. Section 4.4 Expectations of Discrete Random Variables 71 To ﬁnd P (A), note that for A to occur, there are k − 1 possibilities for one of the ﬁrst k − 1 teams
to be a femalefemale team, two possibilities for the k th team (malefemale and femalemale),
and one possibility for the remaining teams to be all malemale teams. Therefore,
P (A) = 2(k − 1)
.
26
3 To ﬁnd P (B), note that for B to occur, there is one possibility for the ﬁrst k − 1 teams to
be all malemale, and two possibilities for the k th team: malefemale and femalemale. The
number of possibilities for the remaining 13 − k teams is equal to the number of distinguishable
26 − 2k)!
permutations of two F ’s and (26 − 2k) − 2 M ’s, which, by Theorem 2.4, is
=
2! (26 − 2k − 2)!
26 − 2k
. Therefore,
2
26 − 2k
2
2
P (B) =
.
26
3
Hence, for 1 ≤ k ≤ 13,
2(k − 1) + 2
P (X = 2k) = P (A) + P (B) = 4.4 26 − 2k
2 26
3 = 1
12
1
k − k+ .
650
26
4 EXPECTATIONS OF DISCRETE RANDOM VARIABLES 1. Yes, of course there is a fallacy in Dickens’ argument. If, in England, at that time there were
exactly two train accidents each month, then Dickens would have been right. Usually, for all
n > 0 and for any two given days, the probability of n train accidents in day 1 is equal to the
probability of n accidents in day 2. Therefore, in all likelihood the risk of train accidents on
the ﬁnal day in March and the risk of such accidents on the ﬁrst day in April would have been
about the same. The fact that train accidents occurred at random days, two per month on the
average, imply that in some months more than two and in other months two or less accidents
were occurring. 2. Let X be the ﬁne that the citizen pays on a random day. Then
E(X) = 25(0.60) + 0(0.40) = 15.
Therefore, it is much better to park legally. 72 Chapter 4 Distribution Functions and Discrete Random Variables 3. The expected value of the winning amount is
30 500
1
4000
+ 800
+ 1, 200, 000
= 0.86.
2, 000, 000
2, 000, 000
2, 000, 000 Considering the cost of the ticket, the expected value of the player’s gain in one game is
−1 + 0.86 = −0.14. 4. Let X be the amount that the player gains in one game, then P (X = 4) = 4
3 6
1
10
4 = 0.114, 1
10
4 P (X = 9) = = 0.005, and P (X = −1) = 1 − 0.114 − 0.005 = 0.881. Thus
E(X) = −1(0.881) + 4(0.114) + 9(0.005) = −0.38.
Therefore, on the average, the player loses 38 cents per game. 5. Let X be the net gain in one play of the game. The set of possible values of X is {−8, −4, 0, 6, 10}.
The probabilities associated with these values are p(−8) = p(0) = and p(6) = p(10) = 2
1
5
2 = E(X) = −8 · 1
5
2 1
=
,
10 p(−4) = 2
1 2
1
5
2 = 4
,
10 2
. Hence
10
4
1
2
1
2
4
−4·
+0·
+6·
+ 10 ·
=.
10
10
10
10
10
5 Since E(X) > 0, the game is not fair. 6. The expected number of defective items is
3 i·
i =0 5
i 15
5−i
20
3 = 0.75. Section 4.4 Expectations of Discrete Random Variables 73 7. For i = 4, 5, 6, 7, let Xi be the proﬁt if i magazines are ordered. Then
E(X4 ) = 4a
,
3 E(X5 ) = 5a 12
4a
2a 6
·
+
·
=
,
3 18
3 18
3 E(X6 ) = 0 ·
E(X7 ) = − 5
6a 7
19a
6
+a·
+
·
=
,
18
18
3 18
18 2a 6
a5
4a 4
7a 3
10a
·
+·
+
·
+
·
=
.
3 18 3 18
3 18
3 18
18 Since 4a/3 > 19a/18 and 4a/3 > 10a/18, either 4, or 5 magazines should be ordered to
maximize the proﬁt in the long run.
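The comparison in the conclusion can be checked with exact rational arithmetic. The coefficients of a below are read from the solution above, taking E(X5) = 4a/3 as the tie between ordering 4 and 5 magazines implies:

```python
from fractions import Fraction

# Coefficient of a in E(X_i) for each order quantity i (from the solution above).
coeffs = {4: Fraction(4, 3), 5: Fraction(4, 3),
          6: Fraction(19, 18), 7: Fraction(10, 18)}
best = max(coeffs.values())
print(sorted(i for i, c in coeffs.items() if c == best))  # -> [4, 5]
```

Since a > 0, the ranking of the coefficients is the ranking of the expected profits, so 4 or 5 magazines maximize long-run profit.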
∞ 8. (a)
x =1 6
6
=2
π 2x2
π
∞ (b) E(X) = x
x =1 2 p(x) = 9. (a)
i =−2 ∞
x =1 1
6 π2
= 1.
= 2·
x2
π
6 6
6
=2
2x2
π
π ∞
x =1 1
= ∞.
x 4
1
4
9
9
+
+
+
+
= 1.
27 27 27 27 27 E(X) = 2
x =−2 xp(x) = 0, E(X2 ) = (b) 2
x =−2 x 2 p(x) = 80/27. Hence E(X) = 2
x =−2 x p(x) = 44/27, E(2X2 − 5X + 7) = 2(80/27) − 5(0) + 7 = 349/27.
10 10. Let R be the radius of the randomly selected disk; then E(2π R) = 2π i
i =1 11. p(x) the probability mass function of X is given by
x
p(x) −3
3/8 0
1/8 3
1/4 4
1/4 Hence
1
1
1
5
3
+0· +3· +4· = ,
8
8
4
4
8
3
1
1
1
77
E(X2 ) = 9 · + 0 · + 9 · + 16 · = ,
8
8
4
4
8
E(X) = −3 · 1
= 11π.
10 74 Chapter 4 Distribution Functions and Discrete Random Variables 3
1
1
1
23
+0· +3· +4· = ,
8
8
4
4
8
23
31
77
E(X2 − 2X ) =
−2
=,
8
8
8
3
1
1
1
23
E(XX ) = −9 · + 0 · + 9 · + 16 · = .
8
8
4
4
8
E(X ) = 3 · 10 12. E(X) = i·
i =1 11
1
=
and E(X2 ) =
10
2 10 i2 ·
i =1 1
77
= . So
10
2 E X (11 − X) = E(11X − X2 ) = 11 · 11 77
−
= 22.
2
2 13. Let X be the number of different birthdays; we have
P (X = 4) = P (X = 3) = P (X = 2) =
P (X = 1) = 365 × 364 × 363 × 362
= 0.9836,
3654
4
365 × 364 × 363
2
= 0.0163,
3654
4
4
365 × 364 +
365 × 364
2
3
= 0.00007,
3654
365
= 0.000000021.
3654 Thus
E(X) = 4(0.9836) + 3(0.0163) + 2(0.00007) + 1(0.000, 000, 021) = 3.98. 14. Let X be the number of children they should continue to have until they have one of each sex.
For i ≥ 2, clearly, X = i if and only if either all of their ﬁrst i − 1 children are boys and the i th
child is a girl, or all of their ﬁrst i − 1 children are girls and the i th child is a boy. Therefore,
by independence,
P (X = i) =
So ∞ E(X) =
i =2 Note that for r  < 1, 1
2 1
i
2 i −1 ∞
i =1 i −1 ir i −1 · 1
1
+
2
2
∞ = −1 + i
i =1 i −1 1
2 = 1/[(1 − r)2 ]. · 1
1
=
2
2 i −1 i −1 = −1 + , i ≥ 2. 1
= 3.
(1 − 1/2)2 Section 4.4 Expectations of Discrete Random Variables 75 15. Let Aj be the event that the person belongs to a family with j children. Then
c c 1
αj .
j P (K = k Aj )P (Aj ) = P (K = k) =
j =0 j =k Therefore,
c c kP (K = k) = E(K) = c k k =1 k =1 j =k αj
=
j c c k =1 j =k kαj
.
j 16. Let X be the number of cards to be turned face up until an ace appears. Let A be the event
that no ace appears among the ﬁrst i − 1 cards that are turned face up. Let B be the event that
the i th card turned face up is an ace. We have P (X = i) = P (AB) = P (B A)P (A) = 4
·
52 − (i − 1) 48
i−1
.
52
i−1 Therefore,
i 49 E(X) =
i =1 48
4
i−1 = 10.6. 52
(53 − i)
i−1 To some, this answer might be counterintuitive. 17. Let X be the largest number selected. Clearly,
i
N P (X = i) = P (X ≤ i) − P (X ≤ i − 1) = n − i−1
N n , i = 1, 2, . . . , N. Hence
N E(X) =
i =1 i n+1 i(i − 1)n
1
−
=n
n
n
N
N
N N i n+1 − i(i − 1)n
i =1
N = 1
Nn N N i n+1 − (i − 1)n+1 − (i − 1)n =
i =1 For large N ,
N N (i − 1)n ≈
i =1 x n dx =
0 N n+1
.
n+1 n+1 − (i − 1)n
i =1 Nn . 76 Chapter 4 Distribution Functions and Discrete Random Variables Therefore, N n+1
n + 1 = nN .
n
N
n+1 N n+1 − E(X) ≈ 18. (a) Note that
1
1
1
=−
.
n(n + 1)
n n+1 So k
n=1 k 1
=
n(n + 1) 1
1
1
−
=1−
.
n n+1
k+1 n=1 This implies that
∞ k p(n) = lim k →∞ n=1 n=1 1
1
= 1 − lim
= 1.
k →∞ k + 1
n(n + 1) Therefore, p is a probability mass function.
∞ ∞ np(n) = (b) E(X) =
n=1 n=1 1
= ∞,
n+1 where the last equality follows since we know from calculus that the harmonic series,
1 + 1/2 + 1/3 + · · · , is divergent. Hence E(X) does not exist. 19. By the solution to Exercise 16, Section 4.3, it should be clear that for 1 ≤ k ≤ n,
2(k − 1) + 2
P (X = 2k) = 2n − 2k
2 2n
3 . Hence
n 2kP (X = 2k) = E(X) =
k =1 = = = 4k(k − 1) + 4k n 4
2n
3
4
2n
3 = 2n
3 k =1
n n k − (4n − 2)
k =1 2· (n + 1)2
.
2n − 1 n k + (2n − n − 1) 3 2 2n − 2k
2 2 k =1 2 k
n=1 n(n + 1)
n(n + 1)(2n + 1)
n2 (n + 1)2
− (4 n − 2 ) ·
+ (2n2 − n − 1)
4
6
2 Section 4.5 Variances and Moments of Discrete Random Variables 77 4.5 VARIANCES AND MOMENTS OF DISCRETE RANDOM VARIABLES
1. On average, in the long run, the two businesses have the same proﬁt. The one that has a proﬁt
with lower standard deviation should be chosen by Mr. Jones because he’s interested in steady
income. Therefore, he should choose the ﬁrst business. 2. The one with lower standard deviation, namely, the second device.
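The reasoning in problems 1 and 2 — equal means, so prefer the lower standard deviation — can be made concrete. A sketch with hypothetical payoff distributions (the numbers are made up for illustration, not taken from the exercises):

```python
from math import sqrt

def mean_sd(dist):
    # dist: list of (value, probability) pairs of a discrete random variable
    m = sum(v * p for v, p in dist)
    var = sum((v - m) ** 2 * p for v, p in dist)
    return m, sqrt(var)

# Two hypothetical businesses with equal expected profit but different spread.
a = [(300, 0.5), (500, 0.5)]
b = [(0, 0.5), (800, 0.5)]
(ma, sa), (mb, sb) = mean_sd(a), mean_sd(b)
assert ma == mb == 400
print("steadier income:", "first" if sa < sb else "second")  # -> first
```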
3. E(X) = 3
x =−3 xp(x) = −1, E(X2 ) = 3
x =−3 x 2 p(x) = 4. Therefore, Var(X) = 4 − 1 = 3. 4. p, the probability mass function of X is given by
x
p(x) −3
3/8 0
3/8 6
2/8 Thus
9 12
3
E(X) = − +
=,
8
8
8
99
9
783
Var(X) =
−
=
= 12.234,
8
64
64 27 72
99
+
=,
8
8
8
√
σX = 12.234 = 3.498. E(X2 ) = 5. By straightforward calculations,
N i· E(X) =
i =1 1
1 N(N + 1)
N +1
=
·
=
,
N
N
2
2 N E(X2 ) = i2 ·
i =1 Var(X) =
σX = 1 N(N + 1)(2N + 1)
(N + 1)(2N + 1)
1
=
·
=
,
N
N
6
6 (N + 1)(2N + 1) (N + 1)2
N2 − 1
−
=
,
6
4
12
N2 − 1
.
12 6. Clearly,
5 E(X) = i·
i =0 5 E(X ) = i2 · 2 i =0 13
i 39
5−i
= 1.25,
52
5
13
39
i
5−i
= 2.426.
52
5 78 Chapter 4 Distribution Functions and Discrete Random Variables Therefore, Var(X) = 2.426 − (1.25)2 = 0.864, and hence σX = √
0.864 = 0.9295. 7. By the Corollary of Theorem 4.2, E(X2 − 2X) = 3 implies that E(X2 ) − 2E(X) = 3.
Substituting E(X) = 1 in this relation gives E(X 2 ) = 5. Hence, by Theorem 4.3,
Var(X) = E(X2 ) − E (X) 2 = 5 − 1 = 4. By Theorem 4.5,
Var(−3X + 5) = 9Var(X) = 9 × 4 = 36. 8. Let X be Harry’s net gain. Then X= ⎧
⎪−2
⎪
⎪
⎪
⎨0.25
⎪0.50
⎪
⎪
⎪
⎩0.75 with probability 1/8
with probability 3/8
with probability 3/8
with probability 1/8. Thus
3
3
1
1
+ 0.25 · + 0.50 · + 0.75 · = 0.125
8
8
8
8
1
3
3
1
E(X2 ) = (−2)2 · + 0.252 · + 0.502 · + 0.752 · = 0.6875.
8
8
8
8
E(X) = −2 · These show that the expected value of Harry’s net gain is 12.5 cents. Its variance is
Var(X) = 0.6875 − 0.1252 = 0.671875. 9. Note that E(X) = E(Y ) = 0. Clearly,
0 if t ≥ 1, 0
1 P Y − 0 ≤ t = if t < 1 1 P X − 0 ≤ t = if t < 10
if t ≥ 10. These relations, clearly, show that for all t > 0,
P Y − 0 ≤ t ≤ P X − 0 ≤ t .
Therefore, X is more concentrated about 0 than Y is. 10. (a) Let X be the number of trials required to open the door. Clearly,
P (X = x) = 1 − 1
n x −1 1 n , x = 1, 2, 3, . . . . Section 4.5 Variances and Moments of Discrete Random Variables 79 Thus
∞ E(X) = 1
n x 1−
x =1 x −1 1 n = 1
n ∞ x 1−
x =1 x −1 1
n (10) . We know from calculus that ∀r , r  < 1,
∞ 1
.
(1 − r)2 xr x −1 = x =1 (11) Thus
∞ x 1−
x =1 1
n x −1 = 1
1
1− 1−
n 2 = n2 . (12) Substituting (12) in (10), we obtain E(X) = n. To calculate Var(X), ﬁrst we ﬁnd E(X 2 ). We
have
∞ E(X ) =
2 x
x =1 2 1
1−
n x −1 1
1
=
n
n ∞ x2 1 −
x =1 1
n x −1 . (13) Now to calculate this sum, we multiply both sides of (11) by r and then differentiate it with
respect to r ; we get
∞
x =1 x 2 r x −1 = 1+r
.
(1 − r)3 Using this relation in (13), we obtain 1
E(X 2 ) = ·
n 1
n
1
1− 1−
n
1+1− 3 = 2n2 − n. Therefore,
Var(X) = (2n2 − n) − n2 = n(n − 1).
(b) Let Ai be the event that on the i th trial the door opens. Let X be the number of trials
required to open the door. Then
1
P (X = 1) = ,
n 80 Chapter 4 Distribution Functions and Discrete Random Variables P (X = 2) = P (Ac A2 ) = P (A2 Ac )P (Ac )
1
1
1
= 1
n−1
1
·
=,
n−1
n
n P (X = 3) = P (Ac Ac A3 ) = P (A3 Ac Ac )P (Ac Ac )
12
21
21
= P (A3 Ac Ac )P (Ac Ac )P (Ac )
21
2
1
1
= 1
n−2 n−1
1
·
·
=.
n−2 n−1
n
n Similarly, P (X = i) = 1/n for 1 ≤ i ≤ n. Therefore, X is a random number selected from
{1, 2, 3, . . . , n}. By Exercise 5, E(X) = (n + 1)/2 and Var(X) = (n2 − 1)/12. 11. For E(X3 ) to exist, we must have E X3  < ∞. Now
∞
3
xn
n=1 6
p(xn ) = 2
π whereas ∞
n=1 ∞
3
xn p(xn ) E X  =
3 n=1 √
(−1)n n n
6
=2
2
n
π 6
=2
π ∞
n=1 ∞
n=1 √
6
nn
=2
2
n
π (−1)n
√ < ∞,
n
∞
n=1 1
√ = ∞.
n 12. For 0 < s < r , clearly,
x s ≤ max 1, x r ≤ 1 + x r , ∀x ∈ R . Let A be the set of possible values of X and p be its probability mass function. Since the r th
absolute moment of X exists, x ∈A x r p(x) < ∞. Now
1 + x r p (x) x s p(x) ≤
x ∈A x ∈A p(x) + =
x ∈A x r p(x) = 1 +
x ∈A x r p(x) < ∞,
x ∈A implies that the absolute moment of order s of X also exists. 13. Var(X)=Var(Y ) implies that
E(X2 ) − E (X) 2 2 = E(Y 2 ) − E (Y ) . Since E(X) = E(Y ), this implies that E(X2 ) = E Y 2 . Let
P (X = a) = p1 , P (X = b) = p2 , P (X = c) = p3 ; P (Y = a) = q1 , P (Y = b) = q2 , P (Y = c) = q3 . Section 4.5 Variances and Moments of Discrete Random Variables 81 Clearly,
p1 + p2 + p3 = q1 + q2 + q3 = 1.
This implies
(p1 − q1 ) + (p2 − q2 ) + (p3 − q3 ) = 0. (14) The relations E(X) = E(Y ) and E(X2 ) = E(Y 2 ) imply that
ap1 + bp2 + cp3 = aq1 + bq2 + cq3
a p1 + b2 p2 + c2 p3 = a 2 q1 + b2 q2 + c2 q3 .
2 These and equation (14) give us the following system of 3 equations in the 3 unknowns p1 − q1 ,
p2 − q2 , and p3 − q3 .
⎧
⎪ (p1 − q1 ) + (p2 − q2 ) + (p3 − q3 ) = 0
⎨
a(p1 − q1 ) + b(p2 − q2 ) + c(p3 − q3 ) = 0
⎪
⎩2
a (p1 − q1 ) + b2 (p2 − q2 ) + c2 (p3 − q3 ) = 0.
In matrix form, this is equivalent to
⎛ ⎞⎛
⎞ ⎛⎞
111
p1 − q1
0
⎝ a b c ⎠ ⎝p2 − q2 ⎠ = ⎝0⎠ .
a 2 b2 c2
0
p3 − q3
Now
⎛ ⎞
111
det ⎝ a b c ⎠ = bc2 + ca 2 + ab2 − ba 2 − cb2 − ac2
a 2 b2 c2
= (c − a)(c − b)(b − a) = 0,
since a , b, and c are three different real numbers. This implies that the matrix
⎛ ⎞
111
⎝a b c⎠
a 2 b2 c2
is invertible. Hence the solution to (15) is
p1 − q1 = p2 − q2 = p3 − q3 = 0.
Therefore, p1 = q1 , p2 = q2 , p3 = q3 implying that X and Y are identically distributed. (15) 82 Chapter 4 Distribution Functions and Discrete Random Variables 14. Let
P (X = a1 ) = p1 , P (X = a2 ) = p2 , ... , P (X = an ) = pn ; P (Y = a1 ) = q1 , P (Y = a2 ) = q2 , ... , P (Y = an ) = qn . Clearly,
p1 + p2 + · · · + pn = q1 + q2 + · · · + qn = 1.
This implies that
(p1 − q1 ) + (p2 − q2 ) + · · · + (pn − qn ) = 0.
The relations E(Xr ) = E(Y r ), for r = 1, 2, . . . , n − 1 imply that
a1 p1 + a2 p2 + · · · + an pn = a1 q1 + a2 q2 + · · · + an qn ,
2
2
2
2
2
2
a1 p1 + a2 p2 + · · · + an pn = a1 q1 + a2 q2 + · · · + an qn , .
.
.
n
n
n
n
n
n
a1 −1 p1 + a2 −1 p2 + · · · + an −1 pn = a1 −1 q1 + a2 −1 q2 + · · · + an −1 qn . These and the previous relation give us the following n equations in the n unknowns p1 − q1 ,
p2 − q2 , . . . , pn − qn .
⎧
⎪ (p1 − q1 ) + (p2 − q2 ) + · · · + (pn − qn ) = 0
⎪
⎪
⎪
⎪ a1 (p1 − q1 ) + a2 (p2 − q2 ) + · · · + an (pn − qn ) = 0
⎪
⎨
2
2
2
⎪ a1 (p1 − q1 ) + a2 (p2 − q2 ) + · · · + an (pn − qn ) = 0
⎪
⎪
⎪ ......................................................
⎪
⎪ n−1
⎩
n
n
a1 (p1 − q1 ) + a2 −1 (p2 − q2 ) + · · · + an −1 (pn − qn ) = 0
In matrix form, this is equivalent to
⎛
⎜
⎜
⎜
⎜
⎜
⎝ 1
a1
2
a1
.
.
. 1
a2
2
a2
.
.
. ···
···
··· n
n
a1 −1 a2 −1 · · · Now ⎛ 1
a1
2
a1
.
.
. 1
a2
2
a2
.
.
. ···
···
··· ⎜
⎜
⎜
det ⎜
⎜
⎝
n
n
a1 −1 a2 −1 · · · ⎞ ⎛⎞
p1 − q1
0
⎟ ⎜p2 − q2 ⎟ ⎜0⎟
⎟⎜
⎟ ⎜⎟
⎟ ⎜ p3 − q3 ⎟ ⎜0⎟
⎟⎜
⎟ = ⎜ ⎟.
⎟ ⎜ . ⎟ ⎜.⎟
. ⎠ ⎝.⎠
⎠⎝ .
.
n−1
pn − qn
0
an
1
an
2
an
.
.
. 1
an
2
an
.
.
.
n
an −1 ⎞⎛ ⎞
⎟
⎟
⎟
(aj − ai ) = 0,
⎟=
⎟
⎠ j =n,n−1,... ,2
i<j (16) Section 4.6 Standardized Random Variables 83 since ai ’s are all different real numbers. The formula for the determinant of this type of
matrices is well known. These are referred to as Vandermonde determinants, after the famous
French mathematicianA. T. Vandermonde (1735–1796). The above determinant being nonzero
implies that the matrix
⎛
⎞
1
1
···
1
⎜ a1
a2 · · ·
an ⎟
⎜
⎟
2
2
2
⎜a
a2 · · ·
an ⎟
⎜1
⎟
⎜.
.
.⎟
.
.
.⎠
⎝.
.
.
n
n
n
a1 −1 a2 −1 · · · an −1
is invertible. Hence the solution to (16) is
p1 − q1 = p2 − q2 = · · · = pn − qn = 0.
Therefore, p1 = q1 , p2 = q2 , . . . , pn = qn , implying that X and Y are identically distributed. 4.6 STANDARDIZED RANDOM VARIABLES 1. Let X1 be the number of TV sets the salesperson in store 1 sells and X2 be the number of ∗
TV sets the salesperson in store 2 sells. We have that X1* = (10 − 13)/5 = −0.6 and X2* = (6 − 7)/4 = −0.25. Therefore, the number of TV sets the salesperson in store 1 sells is 0.6 standard deviations below the mean, whereas the number of TV sets the salesperson in store 2 sells is 0.25 standard deviations below the mean. So Mr. Norton should hire the
salesperson who worked in store 2. 2. Let X be the ﬁnal grade comparable to Velma’s 82 in the midterm. We must have
(X − 68)/15 = (82 − 72)/12.

This gives X = 80.5.

REVIEW PROBLEMS FOR CHAPTER 4

1. Note that C(10, 2) = 45. We have
2
i
p(i) 1, 2, 16, 17
1/45 3, 4, 14, 15
2/45 5, 6, 12, 13
3/45 7, 8, 10, 11
4/45 9
5/45 84 Chapter 4 Distribution Functions and Discrete Random Variables 2. The answer is
1· 5
9
9
4
5
2
+2·
+3·
+4·
+5·
+6·
= 3.676.
34
34
34
34
34
34 3. Let N be the number of secretaries to be interviewed to ﬁnd one who knows TEX. We must ﬁnd the least n for which P (N ≤ n) ≥ 0.50 or 1 − P (N > n) ≥ 0.50 or 1 − (0.98)n ≥ 0.50.
This gives (0.98)n ≤ 0.50 or n ≥ ln 0.50/ ln 0.98 = 34.31. Therefore, n = 35. 4. Let F be the distribution function of X, then
F (t) = 1 − 1 + t
e−t/200 ,
200 t ≥ 0. Using this, we obtain
P (200 ≤ X ≤ 300) = P (X ≤ 300) − P (X < 200) = F (300) − F (200−)
= F (300) − F (200) = 0.442 − 0.264 = 0.178. 5. Let X be the number of sections that will get a hard test. We want to calculate E(X). The
random variable X can only assume the values 0, 1, 2, 3, and 4; its probability mass function
is given by
8
22
i 4−i
p(i) = P (X = i) =
, i = 0, 1, 2, 3, 4,
30
4
where the numerical values of p(i)’s are as follows.
i
p(i) 0
0.2669 1
0.4496 2
0.2360 3
0.0450 4
0.0026 Thus
E(X) = 0(0.2669) + 1(0.4496) + 2(0.2360) + 3(0.0450) + 4(0.00026) = 1.067. 6. (a) 1 − F (6) = 5/36. (b) F (9) = 76/81. (c) F (7) − F (2) = 44/49. 7. We have that
E(X) = (15.85)(0.15) + (15.9)(0.21) + (16)(0.35) + (16.1)(0.15) + (16.2)(0.14) = 16,
Var(X) = (15.85 − 16)2 (0.15) + (15.9 − 16)2 (0.21) + (16 − 16)2 (0.35)
+ (16.1 − 16)2 (0.15) + (16.2 − 16)2 (0.14) = 0.013.
E(Y ) = (15.85)(0.14) + (15.9)(0.05) + (16)(0.64) + (16.1)(0.08) + (16.2)(0.09) = 16,
Var(Y ) = (15.85 − 16)2 (0.14) + (15.9 − 16)2 (0.05) + (16 − 16)2 (0.64)
+ (16.1 − 16)2 (0.08) + (16.2 − 16)2 (0.09) = 0.008. Chapter 4 Review Problems 85 These show that, on the average, companies A and B ﬁll their bottles with 16 ﬂuid ounces of
soft drink. However, the amount of soda in bottles from company A vary more than in bottles
from company B . 8. Let F be the distribution function of X, Then
⎧
⎪0
⎪
⎪
⎪
⎪
⎪7/30
⎪
⎪
⎪
⎪
⎪
⎨13/30
F (t) =
⎪18/30
⎪
⎪
⎪
⎪
⎪
⎪23/30
⎪
⎪
⎪
⎪
⎩
1
∞ 9. (a) To determine the value of k , note that
i =0 implies that ke = 1 or k = e
2t −2t t < 58
58 ≤ t < 62
62 ≤ t < 64
64 ≤ t < 76
76 ≤ t < 80
t ≥ 80.
(2t)i
k
= 1. Therefore, k
i! . Thus p(i) = e −2t (2t)i
.
i! ∞
i =0 (2t)i
= 1. This
i! (b)
3 P (X = i) = e−2t 1 + 2t + 2t 2 + (4t 3 /3) , P (X < 4) =
i =0 P (X > 1) = 1 − P (X = 0) − P (X = 1) = 1 − e−2t − 2te−2t . 10. Let p be the probability mass function, and F be the distribution function of X. We have
1
3
p(0) = p(3) = , p(1) = p(2) = , and
8
8
⎧
⎪0
⎪
⎪
⎪
⎪
⎪1/8
⎪
⎪
⎪
⎨
F (t) = 4/8
⎪
⎪
⎪
⎪7/8
⎪
⎪
⎪
⎪
⎪
⎩
1 t <0
0≤t <1
1≤t <2
2≤t <3
t ≥ 3. 11. (a) The sample space has 52! elements because when the cards are dealt face down, any
ordering of the cards is a possibility. To ﬁnd p(j ), the probability that the 4th king
4
will appear on the j th card, we claim that in
· (j − 1) P · 48! ways the 4th king
3
1
will appear on the j th card, and the remaining 3 kings earlier. To see this, note that 86 Chapter 4 Distribution Functions and Discrete Random Variables 4
combinations for the king that appears on the j th card, and (j − 1) P
3
1
different permutations for the remaining 3 kings that appear earlier. The last term 48!,
is for the remaining 48 cards that can appear in any order in the remaining 48 positions.
Therefore,
we have p(j ) = 4
· (j − 1) P · 48!
3
1
=
52! (b) The probability that the player wins is p(52) = (c) j −1
3
52!
4! 48! = 51
3 j −1
3
.
52
4 52
= 1/13.
4 To ﬁnd
52 52 1
j −1
j
,
52
3
j =4
j =4
4
the expected length of the game, we use a technique introduced by Jenkyns and Muller
in Mathematics Magazine, 54, (1981), page 203. We have the following relation which
can be readily checked.
E= jp(j ) = j
j −1
j −1
4
(j + 1)
−j
=
5
4
4
3 j , j ≥ 5. This gives
52 j
j =5 j −1
4
=
5
3
= 52 52 (j + 1)
j =5 j
j −1
−
j
4
4
j =5 4
52
4
53
−5
5
4
4 = 11, 478, 736, where the nexttothelast equality follows because terms cancel out in pairs. Thus
52 E= = 52 j −1
1
j −1
1
j
=
j
4+
52
52
3
3
j =4
j =5
4
4
1
(4 + 11, 478, 736) = 42.4.
52
4 As Jenkyns and Muller have noted, “This relatively high expectation value is what makes the
game interesting. However, the low probability of winning makes it frustrating!” Chapter 5 Special Discrete
Distributions
5.1
1. BERNOULLI AND BINOMIAL RANDOM VARIABLES
8
4 1
4 3
4 4 4 = 0.087. 1
= 32.
2
1
(b) 6 × + 1 = 4 (note that we should count the mother of the family as well).
2 2. (a) 64 × 3. 6
3 1
6 4. 6
2 1
10 2 9
10 4 5. 5
2 10
30 2 20
30 3 5
6 3 3 = 0.054.
= 0.098.
= 0.33. 6. Let X be the number of defective nails. If the manufacturer’s claim is true, we have
P (X ≥ 2) = 1 − P (X = 0) − P (X = 1)
24
24
=1−
(0.03)0 (0.97)24 −
(0.03)(0.97)23 = 0.162.
0
1
This shows that there is 16.2% chance that two or more defective nails is found. Therefore, it
is not fair to reject company’s claim. 7. Let p and q be the probability mass functions of X and Y , respectively. Then
p(x) = 4
(0.60)x (0.40)4−x ,
x x = 0, 1, 2, 3, 4; 88 Chapter 5 Special Discrete Distributions q(y) = P (Y = y) = P X =
4 =
8 8.
i =0 9. y−1
2 (0.60)(y −1)/2 (0.40)4−[(y −1)/2] , y −1
2 y = 1, 3, 5, 7, 9. 15
(0.8)i (0.2)15−i = 0.142.
i 10
5 11
36 10. (a) 1 − 5 5
0 25
36
1
3 5 0 = 0.108.
2
3 5 − 5
1 1
3 1 2
3 4 = 0.539. (b) 5
2 1
10 2 9
10 3 = 0.073. 11. We know that p(x) is maximum at [(n + 1)p]. If (n + 1)p is an integer, p(x) is maximum at
[(n + 1)p ] = np + p . But in such a case, some straightforward algebra shows that n
n
p np+p−1 (1 − p)n−np−p+1 ,
p np+p (1 − p)n−np−p =
np + p − 1
np + p
implying that p(x) is also maximum at np + p − 1.
52
. If Ernie plays n games, he will get, on
5
52
royal or straight ﬂushes. We want to have 40n
= 1; this
5 12. The probability of royal or straight ﬂush is 40
the average, n 40
gives n = 13. 6
3 1
3 3 52
5
2
3 3 52
5 40 = 64, 974.
= 0.219. 14. 1 − (999/1000)100 = 0.095.
15. The maximum occurs at k = [11(0.45)] = 4. The maximum probability is
10
(0.45)4 (0.55)6 = 0.238.
4 16. Call the event of obtaining a full house success. X, the number of full houses is n independent
poker hands is a binomial random variable with parameters (n, p), where p is the probability
52
that a random poker hand is a full house. To calculate p , note that there are
possible
5
4 4 13!
52
poker hands and
= 3744 full houses. Thus p = 3744
≈ 0.0014. Hence
3 2 11!
5 Section 5.1 Bernoulli and Binomial Random Variables 89 E(X) = np ≈ 0.0014n and Var(X) = np(1 − p) ≈ 0.00144n. Note that if n is approximately
715, then E(X) = 1. Thus we should expect to ﬁnd, on the average, one full house in every
715 random poker hands. 17. 1 − 6
6 1
4 18. 1 − 3000
3000
(0.0005)0 (0.9995)3000 −
(0.0005)(0.9995)2999 ≈ 0.442.
0
1 6 3
4 0 − 6
5 1
4 5 3
≈ 0.995.
4 19. The expected value of the expenses if sent in one parcel is
45.20 × 0.07 + 5.20 × 0.93 = 8.
The expected value of the expenses if sent in two parcels is
(23.30 × 2)(0.07)2 + (23.30 + 3.30) 2
(0.07)(0.93) + (6.60)(0.93)2 = 9.4.
1 Therefore, it is preferable to send in a single parcel. 20. Let n be the minimum number of children they should plan to have. Since the probability of all
girls is (1/2)n and the probability of all boys is (1/2)n , we must have 1−(1/2)n −(1/2)n ≥ 0.95.
ln 0.05
This gives (1/2)n−1 ≤ 0.05 or n − 1 ≥
= 4.32 or n ≥ 5.32. Therefore, n = 6.
ln(0.5) 21. (a) For this to happen, exactly one of the N stations has to attempt transmitting a message.
The probability of this is N
p(1 − p)N −1 = Np(1 − p)N −1 .
1 (b) Let f (p) = Np(1 − p)N −1 . The value of p which maximizes the probability of a message
going through with no collision is the root of the equation f (p) = 0. Now
f (p) = N(1 − p)N −1 − Np(N − 1)(1 − p)N −2 = 0.
Noting that p = 1, this equation gives p = 1/N. This answer makes a lot of sense because at
every “suitable instance,” on average, Np = 1 station will transmit a message.
(c) By part (b), the maximum probability is
f 1
N =N 1
N 1− 1
N N −1 = 1− 1
N N −1 . As N → ∞, this probability approaches 1/e, showing that for large numbers of stations
(in reality 20 or more), the probability of a successful transmission is approximately 1/e
independently of the number of stations if p = 1/N . 90 Chapter 5 Special Discrete Distributions 22. The k students whose names have been called are not standing. Let A1 , A2 , . . . , An−k be the students whose names have not been called. For i , 1 ≤ i ≤ n − k , call Ai a “success,” if he or
she is standing; failure, otherwise. Therefore, whether Ai is standing or sitting is a Bernoulli
trial, and hence the random variable X is the number of successes in n − k Bernoulli trials.
For X to be binomial, for i = j , the event that Ai is a success must be independent of the
event that Aj is a success. Furthermore, the probability that Ai is a success must be the same
for all i , 1 ≤ i ≤ n − k. The latter condition is satisﬁed since Ai is standing if and only if his
original seat was among the ﬁrst k . This happens with probability p = k/n regardless of i .
However, the former condition is not valid. The relation
P Aj is standing  Ai is standing = k−1
,
n shows that given Ai is a success changes the probability that Aj is success. That is, Ai being a
success is not independent of Aj being a success. This shows that X is not a binomial random
variable.

23. Let X be the number of undecided voters who will vote for abortion. The desired probability is
$$P\big(b + (n - X) > a + X\big) = P\Big(X < \frac{n + (b-a)}{2}\Big) = \sum_{i=0}^{\left[\frac{n+(b-a)}{2}\right]} \binom{n}{i}\Big(\frac{1}{2}\Big)^{i}\Big(\frac{1}{2}\Big)^{n-i} = \Big(\frac{1}{2}\Big)^{n} \sum_{i=0}^{\left[\frac{n+(b-a)}{2}\right]} \binom{n}{i}.$$

24. Let X be the net gain of the player per unit of stake. X is a discrete random variable with
possible values −1, 1, 2, and 3. We have
$$P(X = -1) = \binom{3}{0}\Big(\frac{1}{6}\Big)^{0}\Big(\frac{5}{6}\Big)^{3} = \frac{125}{216}, \qquad P(X = 1) = \binom{3}{1}\Big(\frac{1}{6}\Big)\Big(\frac{5}{6}\Big)^{2} = \frac{75}{216},$$
$$P(X = 2) = \binom{3}{2}\Big(\frac{1}{6}\Big)^{2}\Big(\frac{5}{6}\Big) = \frac{15}{216}, \qquad P(X = 3) = \binom{3}{3}\Big(\frac{1}{6}\Big)^{3} = \frac{1}{216}.$$
Hence
$$E(X) = -1 \cdot \frac{125}{216} + 1 \cdot \frac{75}{216} + 2 \cdot \frac{15}{216} + 3 \cdot \frac{1}{216} = -\frac{17}{216} \approx -0.08.$$
Therefore, the player loses 0.08 per unit of stake.
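The expectation in problem 24 can be checked exactly by brute-force enumeration (a sketch; it assumes, consistently with the probabilities above, that the player bets on one face of three fair dice, wins k units when that face shows k times, and loses one unit otherwise):

```python
from fractions import Fraction
from itertools import product

# enumerate all 6^3 equally likely outcomes of three dice; the bet is on face 6
total = Fraction(0)
for dice in product(range(1, 7), repeat=3):
    k = dice.count(6)                      # number of matches
    total += Fraction(k if k > 0 else -1, 6 ** 3)

print(total)  # Fraction(-17, 216), i.e. about -0.08 per unit stake
```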
Section 5.1 Bernoulli and Binomial Random Variables 91

25.
$$E(X^2) = \sum_{x=1}^{n} x^2 \binom{n}{x} p^x (1-p)^{n-x} = \sum_{x=1}^{n} (x^2 - x + x)\binom{n}{x} p^x (1-p)^{n-x}$$
$$= \sum_{x=1}^{n} x(x-1)\binom{n}{x} p^x (1-p)^{n-x} + \sum_{x=1}^{n} x\binom{n}{x} p^x (1-p)^{n-x}$$
$$= \sum_{x=2}^{n} \frac{n!}{(x-2)!\,(n-x)!}\, p^x (1-p)^{n-x} + E(X)$$
$$= n(n-1)p^2 \sum_{x=2}^{n} \binom{n-2}{x-2} p^{x-2}(1-p)^{n-x} + np$$
$$= n(n-1)p^2\big[p + (1-p)\big]^{n-2} + np = n^2p^2 - np^2 + np.$$

26. (a) A four-engine plane is preferable to a two-engine plane if and only if
$$1 - \binom{4}{0}p^0(1-p)^4 - \binom{4}{1}p(1-p)^3 > 1 - \binom{2}{0}p^0(1-p)^2.$$
This inequality gives $p > 2/3$. Hence a four-engine plane is preferable if and only if $p > 2/3$.
If p = 2/3, it makes no difference.
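The crossover at $p = 2/3$ is easy to confirm numerically (the helper `p_fly` is my own notation; it assumes, as in the standard version of this problem, that a plane flies when at least half of its engines work):

```python
from math import comb

def p_fly(n_engines, k_needed, p):
    # probability that at least k_needed of n_engines independent engines work
    return sum(comb(n_engines, i) * p ** i * (1 - p) ** (n_engines - i)
               for i in range(k_needed, n_engines + 1))

# four-engine plane needs >= 2 working engines, two-engine plane needs >= 1
assert p_fly(4, 2, 0.8) > p_fly(2, 1, 0.8)               # p > 2/3: four engines win
assert p_fly(4, 2, 0.5) < p_fly(2, 1, 0.5)               # p < 2/3: two engines win
assert abs(p_fly(4, 2, 2/3) - p_fly(2, 1, 2/3)) < 1e-12  # tie exactly at p = 2/3
```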
(b) A five-engine plane is preferable to a three-engine plane if and only if
$$\binom{5}{5}p^5(1-p)^0 + \binom{5}{4}p^4(1-p) + \binom{5}{3}p^3(1-p)^2 > \binom{3}{2}p^2(1-p) + p^3.$$
Simplifying this inequality, we get $3p^2(p-1)^2(2p-1) \ge 0$, which implies that a five-engine plane is preferable if and only if $2p - 1 \ge 0$. That is, for $p > 1/2$, a five-engine plane is preferable; for $p < 1/2$, a three-engine plane is preferable; for $p = 1/2$ it makes no difference.

27. Clearly, 8 bits are transmitted. A parity check will not detect an error in the 7-bit character
received erroneously if and only if the number of bits received incorrectly is even. Therefore,
the desired probability is
$$\sum_{n=1}^{4} \binom{8}{2n} (1 - 0.999)^{2n} (0.999)^{8-2n} \approx 0.000028.$$

28. The message is erroneously received but the errors are not detected by the parity check if, for
1 ≤ j ≤ 6, j of the characters are erroneously received but not detected by the parity–check,
and the remaining 6 − j characters are all transmitted correctly. By the solution of the previous
exercise, the probability of this event is
$$\sum_{j=1}^{6} \binom{6}{j} (0.000028)^{j} \big[(0.999)^{8}\big]^{6-j} \approx 0.000161.$$

29. The probability of a straight flush is $\dfrac{40}{\binom{52}{5}} \approx 0.000015391$. Hence we must have
$$1 - \binom{n}{0}(0.000015391)^{0}(1 - 0.000015391)^{n} \ge \frac{3}{4}.$$
This gives
$$(1 - 0.000015391)^{n} \le \frac{1}{4}.$$
So
$$n \ge \frac{\log(1/4)}{\log(1 - 0.000015391)} \approx 90071.06.$$
Therefore, $n \approx 90{,}072$.

30. Let $p$, $q$, and $r$ be the probabilities that a randomly selected offspring is $AA$, $Aa$, and $aa$,
respectively. Note that both parents of the offspring are $AA$ with probability $(\alpha/n)^2$, they are both $Aa$ with probability $\big(1 - (\alpha/n)\big)^2$, and the probability is $2(\alpha/n)\big(1 - (\alpha/n)\big)$ that one parent is $AA$ and the other is $Aa$. Therefore, by the law of total probability,
$$p = 1 \cdot \Big(\frac{\alpha}{n}\Big)^2 + \frac{1}{4}\Big(1 - \frac{\alpha}{n}\Big)^2 + \frac{1}{2} \cdot 2\,\frac{\alpha}{n}\Big(1 - \frac{\alpha}{n}\Big) = \frac{\alpha^2}{4n^2} + \frac{\alpha}{2n} + \frac{1}{4},$$
$$q = 0 \cdot \Big(\frac{\alpha}{n}\Big)^2 + \frac{1}{2}\Big(1 - \frac{\alpha}{n}\Big)^2 + \frac{1}{2} \cdot 2\,\frac{\alpha}{n}\Big(1 - \frac{\alpha}{n}\Big) = \frac{1}{2} - \frac{\alpha^2}{2n^2},$$
$$r = 0 \cdot \Big(\frac{\alpha}{n}\Big)^2 + \frac{1}{4}\Big(1 - \frac{\alpha}{n}\Big)^2 + 0 \cdot 2\,\frac{\alpha}{n}\Big(1 - \frac{\alpha}{n}\Big) = \frac{1}{4}\Big(1 - \frac{\alpha}{n}\Big)^2.$$

The probability that at most two of the offspring are $aa$ is
$$\sum_{i=0}^{2} \binom{m}{i} r^{i}(1-r)^{m-i}.$$

The probability that exactly $i$ of the offspring are $AA$ and the remaining are all $Aa$ is
$$\binom{m}{i} p^{i} q^{m-i}.$$

31. The desired probability is the sum of three probabilities: probability of no customer served and
two new arrivals, probability of one customer served and three new arrivals, and probability
of two customers served and four new arrivals. These quantities, respectively, are
$$(0.4)^4 \cdot \binom{4}{2}(0.45)^2(0.55)^2, \qquad \binom{4}{1}(0.6)(0.4)^3 \cdot \binom{4}{3}(0.45)^3(0.55), \qquad \binom{4}{2}(0.6)^2(0.4)^2 \cdot (0.45)^4.$$
The sum of these quantities, which is the answer, is 0.054.

32. (a) Let S be the event that the first trial is a success and E be the event that in n trials, the
number of successes is even. Then
$$P(E) = P(E \mid S)P(S) + P(E \mid S^c)P(S^c).$$
Thus
$$r_n = (1 - r_{n-1})p + r_{n-1}(1 - p).$$
Using this relation, induction, and $r_0 = 1$, we find that
$$r_n = \frac{1}{2}\big[1 + (1 - 2p)^n\big].$$

(b) The left sum is the probability of $0, 2, 4, \ldots,$ or $2[n/2]$ successes (the largest even number not exceeding $n$). Thus it is the probability of an even number of successes in $n$ Bernoulli trials, and hence it is equal to $r_n$.

33. For $0 \le i \le n$, let $B_i$ be the event that $i$ of the balls are red. Let $A$ be the event that in drawing
$k$ balls from the urn, successively, and with replacement, no red balls appear. Then
$$P(B_0 \mid A) = \frac{P(A \mid B_0)P(B_0)}{\sum_{i=0}^{n} P(A \mid B_i)P(B_i)} = \frac{1 \times \Big(\dfrac{1}{2}\Big)^n}{\sum_{i=0}^{n} \Big(\dfrac{n-i}{n}\Big)^{k} \binom{n}{i}\Big(\dfrac{1}{2}\Big)^n} = \frac{1}{\sum_{i=0}^{n} \binom{n}{i}\Big(\dfrac{n-i}{n}\Big)^{k}}.$$

34. Let E be the event that Albert's statement is the truth and F be the event that Donna tells the
truth. Since Rose agrees with Donna and Rose always tells the truth, Donna is telling the truth
as well. Therefore, the desired probability is P (E  F ) = P (EF )/P (F ). To calculate P (F ),
observe that for Rose to agree with Donna, none, two, or all four of Albert, Brenda, Charles,
and Donna should have lied. Since these four people lie independently, this will happen with
probability
$$\Big(\frac{1}{3}\Big)^4 + \binom{4}{2}\Big(\frac{2}{3}\Big)^2\Big(\frac{1}{3}\Big)^2 + \Big(\frac{2}{3}\Big)^4 = \frac{41}{81}.$$
To calculate $P(EF)$, note that $EF$ is the event that Albert tells the truth and Rose agrees with Donna. This happens if all of them tell the truth, or Albert tells the truth but exactly two of Brenda, Charles, and Donna lie. Hence
$$P(EF) = \Big(\frac{1}{3}\Big)^4 + \frac{1}{3} \cdot \binom{3}{2}\Big(\frac{2}{3}\Big)^2\Big(\frac{1}{3}\Big) = \frac{13}{81}.$$
Therefore,
$$P(E \mid F) = \frac{P(EF)}{P(F)} = \frac{13/81}{41/81} = \frac{13}{41} \approx 0.317.$$

5.2 POISSON RANDOM VARIABLES

1. $\lambda = (0.05)(60) = 3$; the answer is $1 - \dfrac{e^{-3}3^0}{0!} = 1 - e^{-3} = 0.9502$.

2. $\lambda = 1.8$; the answer is $\sum_{i=0}^{3} \dfrac{e^{-1.8}(1.8)^i}{i!} \approx 0.89$.

3. $\lambda = 0.025 \times 80 = 2$; the answer is $1 - \dfrac{e^{-2}2^0}{0!} - \dfrac{e^{-2}2^1}{1!} = 1 - 3e^{-2} = 0.594$.

4. $\lambda = (500)(0.0014) = 0.7$. The answer is $1 - \dfrac{e^{-0.7}(0.7)^0}{0!} - \dfrac{e^{-0.7}(0.7)^1}{1!} \approx 0.156$.
5. We call a room “success” if it is vacant next Saturday; we call it “failure” if it is occupied.
Assuming that next Saturday is a random day, X , the number of vacant rooms on that day is
approximately Poisson with rate $\lambda = 35$. Thus the desired probability is
$$1 - \sum_{i=0}^{29} \frac{e^{-35}(35)^i}{i!} = 0.823.$$

6. $\lambda = (3/10)(35) = 10.5$. The probability of 10 misprints in a given chapter is
$$\frac{e^{-10.5}(10.5)^{10}}{10!} = 0.124.$$
Therefore, the desired probability is $(0.124)^2 = 0.0154$.

7. $P(X = 1) = P(X = 3)$ implies that $e^{-\lambda}\lambda = \dfrac{e^{-\lambda}\lambda^3}{3!}$, from which we get $\lambda = \sqrt{6}$. The answer is
$$\frac{e^{-\sqrt{6}}\,(\sqrt{6}\,)^5}{5!} = 0.063.$$

8. The probability that a bun contains no raisins is $\dfrac{e^{-n/k}(n/k)^0}{0!} = e^{-n/k}$. So the answer is
$$\binom{4}{2} e^{-2n/k}\big(1 - e^{-n/k}\big)^2.$$

9. Let X be the number of times the randomly selected kid has hit the target. We are given that
P (X = 0) = 0.04; this implies that
Now e−λ 20
= 0.04 or e−λ = 0.04. So λ = − ln 0.04 = 3.22.
0! P (X ≥ 2) = 1 − P (X = 0) − P (X = 1) = 1 − 0.04 −
= 1 − 0.04 − (0.04)(3.22) = 0.83.
Therefore, 83% of the kids have hit the target at least twice. e−λ λ
1! Section 5.2 Poisson Random Variables 95 10. First we calculate pi ’s from binomial probability mass function with n = 26 and p = 1/365.
Then we calculate them from Poisson probability mass function with parameter λ = np =
26/365. For different values of i , the results are as follows.
i
0
1
2
3 Binomial
0.93115
0.06651
0.00228
0.00005 Poisson
0.93125
0.06634
0.00236
0.00006. Remark: In this example, since success is very rare, even for small n’s Poisson gives good
approximation for binomial. The following table demonstrates this fact for n = 5.
i
0
1
2 Binomial
0.9874
0.0136
0.00007 Poisson
0.9864
0.0136
0.00009. 11. Let N(t) be the number of shooting stars observed up to time t . Let one minute be the unit of
time. Then N (t) : t ≥ 0 is a Poisson process with λ = 1/12. We have that
P N (30) = 3 = e−30/12 (30/12)3
= 0.21.
3! 12. P N (2) = 0 = e−3(2) = e−6 = 0.00248.
13. Let N(t) be the number of wrong calls up to t . If one day is taken as the time unit, it is reasonable
to assume that N (t) : t ≥ 0 is a Poisson process with λ = 1/7. By the independent increment
property and stationarity, the desired probability is
P N (1) = 0 = e−(1/7)·1 = 0.87. 14. Choose one month as the unit of time. Then λ = 5 and the probability of no crimes during
any given month of a year is $P(N(1) = 0) = e^{-5} = 0.0067$. Hence the desired probability is
$$\binom{12}{2}(0.0067)^2(1 - 0.0067)^{10} = 0.0028.$$

15. Choose one day as the unit of time. Then $\lambda = 3$ and the probability of no accidents in one day
is $P(N(1) = 0) = e^{-3} = 0.0498$. The number of days in January without any accidents is approximately another Poisson random variable with approximate rate $31(0.05) = 1.55$. Hence the desired probability is
$$\frac{e^{-1.55}(1.55)^3}{3!} \approx 0.13.$$

16. Choosing one hour as the time unit, we have $\lambda = 6$. Therefore, the desired probability is
$$P\big(N(0.5) = 1 \text{ and } N(2.5) = 10\big) = P\big(N(0.5) = 1 \text{ and } N(2.5) - N(0.5) = 9\big)$$
$$= P\big(N(0.5) = 1\big)\,P\big(N(2.5) - N(0.5) = 9\big) = P\big(N(0.5) = 1\big)\,P\big(N(2) = 9\big) = \frac{3^1 e^{-3}}{1!} \cdot \frac{12^9 e^{-12}}{9!} \approx 0.013.$$

17. The expected number of fractures per meter is $\lambda = 1/60$. Let N(t) be the number of fractures
in $t$ meters of wire. Then
$$P\big(N(t) = n\big) = \frac{e^{-t/60}(t/60)^n}{n!}, \qquad n = 0, 1, 2, \ldots.$$
In a ten-minute period, the machine turns out 70 meters of wire. The desired probability, $P(N(70) > 1)$, is calculated as follows:
$$P\big(N(70) > 1\big) = 1 - P\big(N(70) = 0\big) - P\big(N(70) = 1\big) = 1 - e^{-70/60} - \frac{70}{60}\,e^{-70/60} \approx 0.325.$$

18. Let the epoch at which the traffic light for the left-turn lane turns red be labeled $t = 0$. Let
N(t) be the number of cars that arrive at the junction at or prior to t trying to turn left. Since
cars arrive at the junction according to a Poisson process, clearly, N (t) : t ≥ 0 is a stationary
and orderly process which possesses independent increments. Therefore, N (t) : t ≥ 0 is
also a Poisson process. Its parameter is given by λ = E N (1) = 4(0.22) = 0.88. (For a
rigorous proof, see the solution to Exercise 9, Section 12.2.) Thus
e−(0.88)t (0.88)t
P N (t) = n =
n!
and the desired probability is n e−(0.88)3 (0.88)3
n! n 3 P N (3) ≥ 4 = 1 −
n=0 , ≈ 0.273. 19. Let X be the number of earthquakes of magnitude 5.5 or higher on the Richter scale during the
next 60 years. Clearly, X is a Poisson random variable with parameter λ = 6(1.5) = 9. Let A
be the event that the earthquakes will not damage the bridge during the next 60 years. Since
the events {X = i }, i = 0, 1, 2, . . . , are mutually exclusive and ∞ 1 {X = i } is the sample
i=
space, by the Law of Total Probability (Theorem 3.4),
∞ P (A) = ∞ P (A  X = i)P (X = i) =
i =0
∞ = i =0 (0.985)i
i =0 (1 − 0.015)i e−9 9i
= e−9
i! ∞
i =0 (0.985)(9)
i! e−9 9i
i! i = e−9 e(0.985)(9) = 0.873716. Section 5.2 Poisson Random Variables 97 20. Let N be the total number of letter carriers in America. Let n be the total number of dog bites
letter carriers sustain. Let X be the number of bites a randomly selected letter carrier, say Karl,
sustains on a given year. Call a bite “success,” if it is Karl that is bitten and failure if anyone
but Karl is bitten. Since the letter carriers are bitten randomly, it is reasonable to assume that
X is approximately a binomial random variable with parameters n and p = 1/N . Given that
n is large (it was more than 7000 in 1983 and at least 2,795 in 1997), 1/N is small, and n/N is
moderate, X can be approximated by a Poisson random variable with parameter λ = n/N. We
know that P (X = 0) = 0.94. This implies that (e−λ · λ0 )/0! = 0.94. Thus e−λ = 0.94, and
hence λ = − ln 0.94 = 0.061875. Therefore, X is a Poisson random variable with parameter
0.061875. Now
$$P(X > 1 \mid X \ge 1) = \frac{P(X > 1)}{P(X \ge 1)} = \frac{1 - P(X = 0) - P(X = 1)}{1 - P(X = 0)} = \frac{1 - 0.94 - 0.0581625}{1 - 0.94} = 0.030625,$$
where
$$P(X = 1) = \frac{e^{-\lambda}\lambda^1}{1!} = \lambda e^{-\lambda} = (0.061875)(0.94) = 0.0581625.$$
Therefore, approximately 3.06% of the letter carriers who sustained one bite will be bitten again.
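The conditional probability in problem 20 can be reproduced directly from the Poisson calibration (a sketch; the variable names are mine):

```python
import math

# Poisson model for bites per letter carrier, calibrated so that P(X = 0) = 0.94
lam = -math.log(0.94)
p0 = math.exp(-lam)              # recovers 0.94 by construction
p1 = lam * math.exp(-lam)        # P(X = 1)
cond = (1 - p0 - p1) / (1 - p0)  # P(X > 1 | X >= 1)
print(round(cond * 100, 2))      # ≈ 3.06 percent
```

The tiny difference from the text's 0.030625 comes only from the text rounding $\lambda$ to 0.061875 before multiplying.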
21. We should find $n$ so that
$$1 - \frac{e^{-nM/N}(nM/N)^0}{0!} \ge \alpha.$$
This gives $n \ge -N\ln(1-\alpha)/M$. The answer is the least integer greater than or equal to $-N\ln(1-\alpha)/M$.
with exactly k matches, where the matches occur at n1 , n2 , . . . , nk . This is because each of
the remaining n − k balls can be placed into any of the cells except the cell that has the same
n
number as the ball. Since there are
k combinations n1 , n2 , . . . , nk of 1, 2, . . . , n, the total
k
number of ways we can place the n balls into the n cells so that there are exactly k matches is
n
(n − 1)n−k
n
k
n−k
(n − 1) . Hence the desired probability is
.
k
nn
(b) Let X be the number of matches. We will show that limn→∞ P (X = k) = e−1 /k !; that is,
X is Poisson with parameter 1. We have lim P (X = k) = lim n→∞ n→∞ n
(n − 1)n−k
n
k
= lim
n
n→∞ k
n 1
n!
1
·
· 1−
n→∞ k ! (n − k)!
n = lim n · n−1
n n (n − 1)−k 1
1
= e−1 ·
k
(n − 1)
k! 98 Chapter 5 Special Discrete Distributions Note that limn→∞ 1 − 1
n n = e−1 , and lim n→∞ formula, n!
= 1, since by Stirling’s
(n − k)! (n − 1)k √
n!
2π n · nn · e−n
lim
= lim √
n→∞ (n − k)! (n − 1)k
n→∞
2π(n − k) · (n − k)n−k · e−(n−k) · (n − 1)k
= lim n→∞ (n − k)k 1
n
nn
·
·
·
n − k (n − k)n (n − 1)k ek = 1· · ek · 1 ·
where 1
= 1,
ek (n − k)n
k
nn
→ ek because
= 1−
(n − k)n
nn
n n → e−k . 23. (a) The probability of an even number of events in (t, t + α) is
∞
n=0 e−λα (λα)2n
= e−λα
(2n)!
=e ∞
n=0 1
(λα)2n
= e−αλ
(2n)!
2 ∞ (λα)n 1
+
n!
2 n=0 ∞
n=0 (−λα)n
n! 1 λα 1 −λα
1
e+e
= (1 + e−2λα ).
2
2
2 −αλ (b) The probability of an odd number of events in (t, t + α) is
∞
n=1 e−λα (λα)2n−1
= e−λα
(2n − 1)! ∞
n=1 = e−λα (λα)2n−1
1
= e−λα
(2n − 1)!
2 ∞
n=0 (λα)n 1
−
n!
2 ∞
n=0 (−λα)n
n! 1 λα 1 −λα
1
= 1 − e−2λα .
e−e
2
2
2 24. We have that
P N1 (t) = n, N2 (t) = m
∞ = P N1 (t) = n, N2 (t) = m  N (t) = i P N (t) = i
i =0 = P N1 (t) = n, N2 (t) = m  N (t) = n + m P N (t) = n + m
= e−λt (λt)n+m
n+m n
.
p (1 − p)m ·
(n + m)!
n Therefore,
∞ P N1 (t) = n = P N1 (t) = n, N2 (t) = m
m=0 Section 5.3
∞ =
m=0
∞ = Other Discrete Random Variables 99 n+m n
e−λt (λt)n+m
p (1 − p)m ·
n
(n + m)! (n + m)! n
e−λtp e−λt (1−p) (λt)n (λt)m
p (1 − p)m
n! m!
(n + m)!
m=0
∞ = e−λtp e−λt (1−p) (λtp)n λt (1 − p)
n! m!
m=0 = e−λtp (λtp)n
n! = m e−λtp (λtp)n
.
n! ∞ e−λt (1−p) λt (1 − p)
m!
m=0 m It can easily be argued that the other properties of Poisson process are also satisﬁed for the
process N1 (t) : t ≥ 0 . So N1 (t) : t ≥ 0 is a Poisson process with rate λp . By symmetry,
N2 (t) : t ≥ 0 is a Poisson process with rate λ(1 − p). 25. Let N(t) be the number of females entering the store between 0 and t . By Exercise 24,
N (t) : t ≥ 0 is a Poisson process with rate 1 · (2/3) = 2/3. Hence the desired probability is
e−15(2/3) 15(2/3)
P N (15) = 15 =
15! 15 = 0.035. 26. (a) Let A be the region whose points have a (positive) distance d or less from the given tree.
The desired probability is the probability of no trees in this region and is equal to
2
e−λπ d (λπ d 2 )0
= e−λπ d .
0!
2 (b) We want to ﬁnd the probability that the region A has at most n − 1 trees. The desired
quantity is
n−1 −λπ d 2
e
(λπ d 2 )i
.
i!
i =0 27. p(i) = (λ/i)p(i − 1) implies that for i < λ, the function p is increasing and for i > λ it is
decreasing. Hence i = [λ] is the maximum. 5.3 OTHER DISCRETE RANDOM VARIABLES 1. Let D denote a defective item drawn, and N denote a nondefective item drawn. The answer
is S = N N N, DN N, N DN, N N D, N DD, DND, DDN . 100 Chapter 5 Special Discrete Distributions 2. S = s s, f ss, sf s, sff s, ff ss, f sf s, sfff s, f sff s, fff ss, ff sf s, . . . .
3. (a) 1/(1/12) = 12. (b) 11
12 2 1
≈ 0.07.
12 4. (a) (1 − pq)r −1 pq. (b) 1/pq .
5. 7
(0.2)3 (0.8)5 ≈ 0.055.
2 6. (a) (0.55)5 (0.45) ≈ 0.023. (b) (0.55)3 (0.45)(0.55)3 (0.45) ≈ 0.0056.
7. 5
1 45
7 50
= 0.42.
8 8. The probability that at least n light bulbs are required is equal to the probability that the ﬁrst
n − 1 light bulbs are all defective. So the answer is p n−1 . 9. We have n−1 x
p (1 − p)n−x
x
x−1
=.
nx
n
p (1 − p)n−x
x P (N = n)
=
P (X = x) 10. Let X be the number of the words the student had to spell until spelling a word correctly. The
random variable X is geometric with parameter 0.70. The desired probability is given by
4 (0.30)i −1 (0.70) = 0.9919. P (X ≤ 4) =
i =1 11. The average number of digits until the ﬁfth 3 is 5/(1/10) = 50. So the average number of
digits before the ﬁfth 3 is 49. 12. The probability that a random bridge hand has three aces is p= 4
3 48
10
52
13 = 0.0412. Therefore, the average number of bridge hands until one has three aces is 1/p = 1/0.0412 =
24.27. 13. Either the (N + 1)st success must occur on the (N + M − m + 1)st trial, or the (M + 1)st Section 5.3 Other Discrete Random Variables 101 failure must occur on the (N + M − m + 1)st trial. The answer is
N +M −m
N 1
2 N +M −m+1 + N +M −m
M 1
2 N +M −m+1 . 14. We have that X + 10 is negative binomial with parameters (10, 0.15). Therefore, ∀i ≥ 0,
P (X = i) = P (X + 10 = i + 10) = i+9
(0.15)10 (0.85)i .
9 15. Let X be the number of good diskettes in the sample. The desired probability is P (X ≥ 9) = P (X = 9) + P (X = 10) = 10 90
1
9
100
10 + 90 10
10
0
100
10 ≈ 0.74. 16. We have that 560(0.35) = 196 persons make contributions. So the answer is 1− 364
15
560
15 − 364 196
14
1
560
15 = 0.987. 17. The transmission of a message takes more than t minutes, if the ﬁrst [t/2] + 1 times it is sent it will be garbled, where [t/2] is the greatest integer less than or equal to t/2. The probability
of this is p[t/2]+1 . 18. The probability that the sixth coin is accepted on the nth try is
n−1
(0.10)6 (0.90)n−6 .
5
Therefore, the desired probability is
∞
n=50 49 n−1
n−1
(0.10)6 (0.90)n−6 = 1 −
(0.10)6 (0.90)n−6 = 0.6346.
5
5
n=6 19. The probability that the station will successfully transmit or retransmit a message is (1 − p)N −1 .
This is because for the station to successfully transmit or retransmit its message, none of the
other stations should transmit messages at the same instance. The number of transmissions
and retransmissions of a message until the success is geometric with parameter (1 − p)N −1 .
Therefore, on average, the number of transmissions and retransmissions is 1/(1 − p)N −1 . 102 Chapter 5 Special Discrete Distributions 20. If the ﬁfth tail occurs after the 14th trial, ten or more heads have occurred. Therefore, the ﬁfth
tail occurs before the tenth head if and only if the ﬁfth tail occurs before or on the 14th ﬂip.
Calling tails success, X, the number of ﬂips required to get the ﬁfth tail is negative binomial
with parameters 5 and 1/2. The desired probability is given by
14 14 P (X = n) =
n=5 n=5 n−1
4 1
2 5 1
2 n−5 ≈ 0.91. 21. The probability of a straight is
10 45 − 40
= 0.003924647.
52
5
Therefore, the expected number of poker hands required until the ﬁrst straight is
1/0.003924647 = 254.80. 22. (a) Since P (X = n − 1)
1
=
> 1,
P (X = n)
1−p
P (X = n) is a decreasing function of n; hence its maximum is at n = 1.
(b) The probability that X is even is given by
∞ ∞ P (X = 2k) =
k =1 k =1 p(1 − p)2k−1 = p(1 − p)
1−p
.
=
2
1 − (1 − p)
2−p (c) We want to show the following:
Let X be a discrete random variable with the set of possible values 1, 2, 3 . . . .
If for all positive integers n and m,
P (X > n + m  X > m) = P (X > n), (17) then X is a geometric random variable. That is, there exists a number p ,
0 < p < 1, such that
P (X = n) = p(1 − p)n−1 . (18) To prove this, note that (17) implies that for all positive integers n and m,
P (X > n + m)
= P (X > n).
P (X > m)
Therefore,
P (X > n + m) = P (X > n)P (X > m). (19) Section 5.3 Other Discrete Random Variables 103 Let p = P (X = 1); using induction, we prove that (18) is valid for all positive integers n. To
show (18) for n = 2, note that (19) implies that
P (X > 2) = P (X > 1)P (X > 1).
Since P (X > 1) = 1 − P (X = 1) = 1 − p , this relation gives
1 − P (X = 1) − P (X = 2) = (1 − p)2 ,
or
1 − p − P (X = 2) = (1 − p)2 ,
which yields
P (X = 2) = p(1 − p),
so (18) is also true for n = 2. Now assume that (18) is valid for all positive integers i , i ≤ n;
that is, assume that
P (X = i) = p(1 − p)i −1 , i ≤ n. (20) We will show that (18) is true for n + 1. The induction hypothesis [relation (20)] implies that
n n p(1 − p)i −1 = p P (X = i) = P (X ≤ n) =
i =1 i =1 1 − (1 − p)n
= 1 − (1 − p)n .
1 − (1 − p) So P (X > n) = (1 − p)n and, similarly, P (X > n − 1) = (1 − p)n−1 . Now (19) yields
P (X > n + 1) = P (X > n)P (X > 1),
which implies that
1 − P (X ≤ n) − P (X = n + 1) = (1 − p)n (1 − p).
Substituting P (X ≤ n) = 1 − (1 − p)n in this relation, we obtain
P (X = n + 1) = p(1 − p)n ,
which establishes (18) for n + 1. Therefore, we have what we wanted to show. 23. Consider a coin for which the probability of tails is 1 − p and the probability of heads is p.
In successive and independent ﬂips of the coin, let X1 be the number of ﬂips until the ﬁrst
head, X2 be the total number of ﬂips until the second head, X3 be the total number of ﬂips
until the third head, and so on. Then the length of the ﬁrst character of the message and X1
are identically distributed. The total number of the bits forming the ﬁrst two characters of
the message and X2 are identically distributed. The total number of the bits forming the ﬁrst
three characters of the message and X3 are identically distributed, and so on. Therefore, the
total number of the bits forming the message has the same distribution as Xk . This is negative
binomial with parameters k and p . 104 Chapter 5 Special Discrete Distributions 24. Let X be the number of cartons to be opened before ﬁnding one without rotten eggs. X is not a
geometric random variable because the number of cartons is limited, and one carton not having
rotten eggs is not independent of another carton not having rotten eggs. However, it should be
1000
1200
obvious that a geometric random variable with parameter p =
= 0.1109 is
12
12
a good approximation for X . Therefore, we should expect approximately 1/p = 1/0.1109 =
9.015 cartons to be opened before ﬁnding one without rotten eggs. 25. Either the N th success should occur on the (2N − M)th trial or the N th failure should occur
on the (2N − M)th trial. By symmetry, the answer is
2· 2N − M − 1
N −1 1
2 N 1
2 N −M = 2N − M − 1
N −1 1
2 2N −M −1 . 26. The desired quantity is 2 times the probability of exactly N successes in (2N − 1) trials and
failures on the (2N)th and (2N + 1)st trials:
2 2N − 1
N 1
2 N 1− 1
2 (2N −1)−N · 1− 1
2 2 2N − 1
N = 1
2 2N . 27. Let X be the number of rolls until Adam gets a six. Let Y be the number of rolls of the die
until Andrew rolls an odd number. Since the events (X = i), 1 ≤ i < ∞, form a partition of
the sample space, by Theorem 3.4,
∞ ∞ P Y >XX=i P X=i = P Y >X =
i =1
∞ =
i =1 P Y >i P X=i
i =1 1
2 i 5
·
6 i −1 61
1
=·
6
56 ∞
i =1 5
12 i 1
=·
5 5
12 1
=,
7
5
1−
12 where P (Y > i) = (1/2)i since for Y to be greater than i , Andrew must obtain an even number
on each of the the ﬁrst i rolls. 28. The probability of 4 tagged trout among the second 50 trout caught is
pn = 50
4 n − 50
46
.
n
50 It is logical to ﬁnd the value of n for which pn is maximum. (In statistics this value is called
the maximum likelihood estimate for the number of trout in the lake.) To do this, note that
(n − 50)2
pn
=
.
pn−1
n(n − 96) Section 5.3 Other Discrete Random Variables 105 Now pn ≥ pn−1 if and only if (n − 50)2 ≥ n(n − 96), or n ≤ 625. Therefore, n = 625 makes
pn maximum, and hence there are approximately 625 trout in the lake. 29. (a) Intuitively, it should be clear that the answer is D/N . To prove this, let Ej be the event of
obtaining exactly j defective items among the ﬁrst (k − 1) draws. Let Ak be the event that the
k th item drawn is defective. We have
k −1 P (Ak ) = k −1 P (Ak  Ej )P (Ej ) =
j =0 j =0 Now
(D − j )
and
(N − k + 1) D−j
·
N −k+1 D
j N −D
k−1−j
.
N
k−1 D
D−1
=D
j
j
N
N −1
=N
.
k−1
k−1 Therefore,
k −1 P (Ak ) = D D−1
j j =0 N −D
k−1−j
N −1
N
k−1 where
k −1
j =0 D−1
j = D
N k −1
j =0 N −D
k−1−j
N −1
k−1 D−1
j D−1
j N −D
k−1−j
N −1
k−1 = D
,
N =1 N −D
k−1−j
since
is the probability mass function of a hypergeometric random
N −1
k−1
variable with parameters N − 1, D − 1, and k − 1.
(b) Intuitively, it should be clear that the answer is (D − 1)/(N − 1). To prove this, let Ak be
as before and let Fj be the event of exactly j defective items among the ﬁrst (k − 2) draws.
Let B be the event that the (k − 1)st and the k th items drawn are defective. We have
k −2 P (B) = P (B  Fj )P (Fj )
j =0 106 Chapter 5 Special Discrete Distributions k −2 =
j =0 k −2 (D − j )(D − j − 1)
·
(N − k + 2)(N − k + 1) N −D
k−2−j
N
k−2 N −D
k−2−j
N −2
N (N − 1)
k−2 D(D − 1) =
j =0 D(D − 1)
=
N(N − 1)
= D
j k −2 D−2
j D−2
j j =0 D(D − 1)
.
N(N − 1) N −D
k−2−j
N −2
k−2 Using this, we have that the desired probability is
D(D − 1)
N (N − 1)
P (B)
P (Ak Ak−1 )
D−1
=
=
.
P (Ak  Ak−1 ) =
=
P (Ak−1 )
P (Ak−1 )
N −1
D
N REVIEW PROBLEMS FOR CHAPTER 5
20 1.
i =12 20
(0.25)i (0.75)20−i = 0.0009.
i 2. N(t), the number of customers arriving at the post ofﬁce at or prior to t is a Poisson process
with λ = 1/3. Thus
6 6 P N (30) = i = P N (30) ≤ 6 =
i =0 3. 4 ·
2 4.
i =0 8
= 1.067.
30
12
(0.30)i (0.70)12−i = 0.253.
i i =0 e−(1/3)30 (1/3)30
i! i = 0.130141. Chapter 5 5. Review Problems 107 5
(0.18)2 (0.82)3 = 0.179.
2
1999 i−1
2−1 6.
i =2 160
i 12 7.
i =7 1
1000 200
12 − i
360
12 2 999
1000 i −2 = 0.59386. = 0.244. 8. Call a train that arrives between 10:15 A.M. and 10:28 A.M. a success. Then p, the probability
of success is
p= 28 − 15
13
=
.
60
60 Therefore, the expected value and the variance of the number of trains that arrive in the given
period are 10(13/60) = 2.167 and 10(13/60)(47/60) = 1.697, respectively. 9. The number of checks returned during the next two days is Poisson with λ = 6. The desired
probability is
4 P (X ≤ 4) =
i =0 e−6 6i
= 0.285.
i! 10. Suppose that 5% of the items are defective. Under this hypothesis, there are 500(0.05) = 25
defective items. The probability of two defective items among 30 items selected at random is
25
2 475
28
500
30 = 0.268. Therefore, under the above hypothesis, having two defective items among 30 items selected
at random is quite probable. The shipment should not be rejected. 11. N is a geometric random variable with p = 1/2. So E(N) = 1/p = 2, and Var(N ) =
(1 − p)/p2 = 1 − (1/2) /(1/4) = 2. 12. 5
6 5 1
= 0.067.
6 13. The number of times a message is transmitted or retransmitted is geometric with parameter
1 − p . Therefore, the expected value of the number of transmissions and retransmissions of a 108 Chapter 5 Special Discrete Distributions message is 1/(1 − p). Hence the expected number of retransmissions of a message is
p
1
−1=
.
1−p
1−p 14. Call a customer a “success,” if he or she will make a purchase using a credit card. Let E
be the event that a customer entering the store will make a purchase. Let F be the event that
the customer will use a credit card. To ﬁnd p, the probability of success, we use the law of
multiplication:
p = P (EF ) = P (E)P F  E = (0.30)(0.85) = 0.255.
The random variable X is binomial with parameters 6 and 0.255. Hence
6
i P X=i = 0.255 i 1 − 0.255 6−i , i = 0, 1, . . . , 6. Clearly, E(X) = np = 6(0.255) = 1.53 and
Var(X) = np(1 − p) = 6(0.255)(1 − 0.255) = 1.13985.
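The mean and variance of this binomial can be recovered from its probability mass function rather than the shortcut formulas (a sketch, with my own variable names):

```python
from math import comb

n, p = 6, 0.255
pmf = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
mean = sum(i * q for i, q in enumerate(pmf))
var = sum(i ** 2 * q for i, q in enumerate(pmf)) - mean ** 2

assert abs(sum(pmf) - 1) < 1e-12
assert abs(mean - 1.53) < 1e-9      # matches np
assert abs(var - 1.13985) < 1e-9    # matches np(1 - p)
```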
5 15. 18
i i =3 10
5−i
28
5 = 0.772. 16. By the formula for the expected value of a hypergeometric random variable, the desired quantity
is (5 × 6)/16 = 1.875. 17. We want to ﬁnd the probability that at most 4 of the seeds do not germinate:
4
i =0
2 18. 1 −
i =0 40
(0.06)i (0.94)40−i = 0.91.
i 20
(0.06)i (0.94)20−i = 0.115.
i Let X be the number of requests for reservations at the end of the second day. It is reasonable
to assume that X is Poisson with parameter 3 × 3 × 2 = 18. Hence the desired probability is
23 23 P (X = i) = 1 − P (X ≥ 24) = 1 −
i =0 i =0 e−18 (18)i
= 1 − 0.89889 = 0.10111.
i! Chapter 5 Review Problems 109 19. Suppose that the company’s claim is correct. Then the probability of 12 or less drivers using
seat belts regularly is
12 20
(0.70)i (0.30)20−i ≈ 0.228.
i i =0 Therefore, under the assumption that the company’s claim is true, it is quite likely that out of
20 randomly selected drivers, 12 use seat belts. This is not a reasonable evidence to conclude
that the insurance company’s claim is false. 20. (a) (0.999)999 (0.001)1 = 0.000368. (b) 2999
(0.001)3 (0.999)2997 = 0.000224.
2 21. Let X be the number of children having the disease. We have that the desired probability is P (X = 3  X ≥ 1) = 22. (a) w
w+b n−1 b
.
w+b P (X = 3)
=
P (X ≥ 1)
w
w+b (b) n−1 5
(0.23)3 (0.77)2
3
= 0.0989.
1 − (0.77)5 . 23. Let n be the desired number of seeds to be planted. Let X be the number of seeds which
will germinate. We have that X is binomial with parameters n and 0.75. We want to ﬁnd the
smallest n for which
P (X ≥ 5) ≥ 0.90.
or, equivalently,
P (X < 5) ≤ 0.10.
That is, we want to ﬁnd the smallest n for which
4
i =0 n
(0.75)i (.25)n−i ≤ 0.10.
i By trial and error, as the following table shows, we ﬁnd that the smallest n satisfying
P (X < 5) ≤ 0.10 is 9. So at least nine seeds is to be planted.
n
5
6
7
8
9 4
n
i =0 i (0.75)i (.25)n−i
0.7627
0.4661
0.2436
0.1139
0.0489 110 Chapter 5 Special Discrete Distributions 24. Intuitively, it must be clear that the answer is k/n. To prove this, let B be the event that the i th
baby born is blonde. Let A be the event that k of the n babies are blondes. We have P (B  A) = P (AB)
=
P (A) p· n − 1 k−1
p (1 − p)n−k
k−1
=
nk
p (1 − p)n−k
k n−1
k−1
n
k = k
.
n 25. The size of a seed is a tiny fraction of the size of the area. Let us divide the area up into many
small cells each about the size of a seed. Assume that, when the seeds are distributed, each
of them will land in a single cell. Accordingly, the number of seeds distributed will equal
the number of nonempty cells. Suppose that each cell has an equal chance of having a seed
independent of other cells (this is only approximately true). Since λ is the average number of
seeds per unit area, the expected number of seeds in the area, A, is λA. Let us call a cell in
A a “success” if it is occupied by a seed. Let n be the total number of cells in A and p be
the probability that a cell will contain a seed. Then X, the number of cells in A with seeds
is a binomial random variable with parameters n and p. Using the formula for the expected
number of successes in a binomial distribution (= np ), we see that np = λA and p = λA/n.
As n goes to inﬁnity, p approaches zero while np remains ﬁnite. Hence the number of seeds
that fall on the area A is a Poisson random variable with parameter λA and
P (X = i) = e−λA (λA)i
.
i! 26. Let D/N → p, then by the Remark 5.2, for all n,
D
x N −D
n−x
N
n ≈ nx
p (1 − p)n−x .
x Now since n → ∞ and nD/N → λ, n is large and np is appreciable, thus
e−λ λx
nx
.
p (1 − p)n−x ≈
x!
x Chapter 6 C ontinuous R andom
Variables
6.1 PROBABILITY DENSITY FUNCTIONS
∞ 1. (a) ce−3x dx = 1 ⇒ c = 3. 0 1/2 (b) P (0 < X ≤ 1/2) = 3e−3x dx = 1 − e−3/2 ≈ 0.78. 0 ⎧
⎨ 32
2. (a) f (x) = x 3
⎩
0 x≥4
x < 4. (b) P (X ≤ 5) = 1 − (16/25) = 9/25,
P (X ≥ 6) = 16/36 = 4/9,
P (5 ≤ X ≤ 7) = 1 − (16/49) − 1 − (16/25) = 0.313,
P (1 ≤ X < 3.5) = 0 − 0 = 0.
2 3. (a) c(x − 1)(2 − x) dx = 1 1 ⇒c − x (b) F (x) = x 3 3x 2
+
− 2x
3
2 2
1 = 1 ⇒ c = 6. 6(x − 1)(2 − x) dx, 1 ≤ x < 2. Thus
1 ⎧
⎪0
⎨
F (x) = −2x 3 + 9x 2 − 12x + 5
⎪
⎩
1 x<1
1≤x<2
x ≥ 2. (c) P (X < 5/4) = F (5/4) = 5/32,
P (3/2 ≤ X ≤ 2) = F (2) − F (3/2) = 1 − (1/2) = 1/2.
1.5 4. (a) P (X < 1.5) =
1 2
2
dx = .
2
x
3 112 Chapter 6 Continuous Random Variables
1.25 (b) P (1 < X < 1.25  X < 1.5) = 1
1.5
1 1 5. (a)
−1 2
dx
x2
2
dx
x2 c
dx = 1 ⇒ c · arcsin x
√
1 − x2 1
−1 = 2 /5
3
=.
2/3
5 = 1 ⇒ c = 1/π. (b) For −1 < x < 1,
x F (x) = −1 1
π 1 − x2 dx = ⎧
⎪0
⎪
⎪
⎪
⎨
1
1
F (x) =
arcsin x +
⎪π
2
⎪
⎪
⎪
⎩
1 Thus 1
1
arcsin x + .
π
2
x < −1
−1 ≤ x < 1
x ≥ 1. 6. Since h(x) ≥ 0 and
∞
α 1
f (x)
dx =
1 − F (α)
1 − F (α) ∞ f (x) dx = α 1
1 − F (α) = 1,
1 − F (α) h is a probability density function. 7. (a) Let F be the distribution function of X. Then X is symmetric about α if and only if for all
x , 1 − F (α + x) = F (α − x), or upon differentiation f (α + x) = f (α − x).
(b) f (α + x) = f (α − x) if and only if (α − x − 3)2 = (α + x − 3)2 . This is true for all x, if
and only if α − x − 3 = −(α + x − 3) which gives α = 3. A similar argument shows that g
is symmetric about α = 1.
8. (a) Since f is a probability density function, ∫_(−∞)^∞ f(x) dx = 1. But

∫_(−∞)^∞ f(x) dx = ∫_(−1)^0 k(2x − 3x²) dx = k[x² − x³]_(−1)^0 = −2k.

So −2k = 1, or k = −1/2.
(b) The loss is at most $500 if and only if X ≥ −1/2. Therefore, the desired probability is

P(X ≥ −1/2) = ∫_(−1/2)^0 −(1/2)(2x − 3x²) dx = −(1/2)[x² − x³]_(−1/2)^0 = 3/16.

9. P(X > 15) = ∫_15^∞ (1/15)e^(−x/15) dx = 1/e. Thus the answer is

Σ_(i=4)^8 C(8, i)(1/e)^i (1 − 1/e)^(8−i) = 0.3327.

10. Since αf + βg ≥ 0 and
∫_(−∞)^∞ [αf(x) + βg(x)] dx = α ∫_(−∞)^∞ f(x) dx + β ∫_(−∞)^∞ g(x) dx = α + β = 1,

αf + βg is also a probability density function.

11. Since F(−∞) = 0 and F(∞) = 1, we have that
α + β(−π/2) = 0
α + β(π/2) = 1.
Solving this system of two equations in two unknowns, we obtain α = 1/2 and β = 1/π. Thus

f(x) = F′(x) = 2/[π(4 + x²)], −∞ < x < ∞.

6.2 DENSITY FUNCTION OF A FUNCTION OF A RANDOM VARIABLE

1. Let G be the distribution function of Y; for −8 < y < 8,
G(y) = P(Y ≤ y) = P(X³ ≤ y) = P(X ≤ y^(1/3)) = ∫_(−2)^(y^(1/3)) (1/4) dx = (1/4)y^(1/3) + 1/2.

Therefore,

G(y) = 0 for y < −8; G(y) = (1/4)y^(1/3) + 1/2 for −8 ≤ y < 8; G(y) = 1 for y ≥ 8.

This gives

g(y) = G′(y) = (1/12)y^(−2/3) for −8 < y < 8; g(y) = 0 otherwise.

Let H be the distribution function of Z; for 0 ≤ z < 16,

H(z) = P(X⁴ ≤ z) = P(−z^(1/4) ≤ X ≤ z^(1/4)) = ∫_(−z^(1/4))^(z^(1/4)) (1/4) dx = (1/2)z^(1/4).

Thus

H(z) = 0 for z < 0; H(z) = (1/2)z^(1/4) for 0 ≤ z < 16; H(z) = 1 for z ≥ 16.

This gives

h(z) = H′(z) = (1/8)z^(−3/4) for 0 < z < 16; h(z) = 0 otherwise.

2. Let G be the probability distribution function of Y and g be its probability density function.
For t > 0,
G(t) = P(e^X ≤ t) = P(X ≤ ln t) = F(ln t).

For t ≤ 0, G(t) = 0. Therefore,

g(t) = G′(t) = (1/t) f(ln t) for t > 0; g(t) = 0 for t ≤ 0.
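As a numerical check of this change-of-variables formula, take F to be the standard normal distribution function (an arbitrary test choice, not part of the exercise). Then g(t) = f(ln t)/t is the lognormal density, and it should integrate to 1:

```python
from math import exp, log, pi, sqrt

f = lambda x: exp(-x * x / 2) / sqrt(2 * pi)  # standard normal density (test choice)
g = lambda t: f(log(t)) / t                   # density of Y = e^X by the formula above

# midpoint-rule integral over (0, 60]; the neglected tail mass is tiny
n, lo, hi = 200_000, 1e-9, 60.0
h = (hi - lo) / n
total = sum(g(lo + (i + 0.5) * h) for i in range(n)) * h
print(round(total, 3))  # close to 1.0
```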
3. The set of possible values of X is A = (0, ∞). Let h : (0, ∞) → R be defined by h(x) = x√x. The set of possible values of h is B = (0, ∞). The inverse of h is g, where g(y) = y^(2/3). Thus g′(y) = 2/(3∛y), and hence

f_Y(y) = [2/(3∛y)] e^(−y^(2/3)), y ∈ (0, ∞).

To find the probability density function of e^(−X), let h : (0, ∞) → R be defined by h(x) = e^(−x); h is an invertible function with the set of possible values B = (0, 1). The inverse of h is g(z) = −ln z. So g′(z) = −1/z. Therefore,

f_Z(z) = e^(−(−ln z)) · |−1/z| = z · (1/z) = 1, z ∈ (0, 1); 0, otherwise.
4. The set of possible values of X is A = (0, ∞). Let h : (0, ∞) → R be defined by h(x) = log₂x. The set of possible values of h is B = (−∞, ∞). h is invertible and its inverse is g(y) = 2^y, where g′(y) = (ln 2)2^y. Thus

f_Y(y) = 3e^(−3·2^y) (ln 2)2^y = (3 ln 2) 2^y e^(−3(2^y)), y ∈ (−∞, ∞).

5. Let G and g be the probability distribution and the probability density functions of Y, respectively. Then
G(y) = P(Y ≤ y) = P(X^(2/3) ≤ y) = P(X ≤ y√y) = ∫_0^(y√y) λe^(−λx) dx = 1 − e^(−λy√y), y ∈ [0, ∞).

So

g(y) = G′(y) = (3λ/2)√y e^(−λy√y), y ≥ 0; 0, otherwise.

6. Let G and g be the probability distribution and density functions of X², respectively. For
t ≥ 0,

G(t) = P(X² ≤ t) = P(−√t < X < √t) = F(√t) − F(−√t). Thus

g(t) = G′(t) = [1/(2√t)] f(√t) + [1/(2√t)] f(−√t) = [1/(2√t)][f(√t) + f(−√t)], t ≥ 0.
For t < 0, g(t) = 0. 7. Let G and g be the distribution and density functions of Z , respectively. For −π/2 < z < π/2,
G(z) = P(arctan X ≤ z) = P(X ≤ tan z) = ∫_(−∞)^(tan z) 1/[π(1 + x²)] dx = (1/π) arctan x |_(−∞)^(tan z) = (1/π)z + 1/2.

Thus

g(z) = 1/π for −π/2 < z < π/2; g(z) = 0 elsewhere.

8. Let G and g be distribution and density functions of Y, respectively. Then

G(t) = P(Y ≤ t) = P(Y ≤ t | X ≤ 1)P(X ≤ 1) + P(Y ≤ t | X > 1)P(X > 1)
= P(X ≤ t | X ≤ 1)P(X ≤ 1) + P(X ≥ 1/t | X > 1)P(X > 1).

For t ≥ 1, this gives

G(t) = 1 · ∫_0^1 e^(−x) dx + 1 · ∫_1^∞ e^(−x) dx = 1.

For 0 < t < 1, this gives

G(t) = P(X ≤ t) + P(X ≥ 1/t) = ∫_0^t e^(−x) dx + ∫_(1/t)^∞ e^(−x) dx = 1 − e^(−t) + e^(−1/t).

Hence

G(t) = 0 for t ≤ 0; G(t) = 1 − e^(−t) + e^(−1/t) for 0 < t < 1; G(t) = 1 for t ≥ 1.

Therefore,

g(t) = G′(t) = e^(−t) + (1/t²)e^(−1/t) for 0 < t < 1; g(t) = 0
elsewhere.

6.3 EXPECTATIONS AND VARIANCES

1. The probability density function of X is f(x) = 32/x³ for x ≥ 4; f(x) = 0 for x < 4. Thus

(a) E(X) = ∫_4^∞ x · (32/x³) dx = ∫_4^∞ (32/x²) dx = 8.

(b) E(X²) = ∫_4^∞ (32/x) dx = ∞; so Var(X) = E(X²) − [E(X)]² does not exist.
2. (a) E(X) = 6 ∫_1^2 (−x³ + 3x² − 2x) dx = 3/2.

(b) E(X²) = 6 ∫_1^2 (−x⁴ + 3x³ − 2x²) dx = 23/10; so Var(X) = (23/10) − (9/4) = 1/20, and σ_X = 1/√20.

3. The standardized value of the lifetime of a car muffler manufactured by company A is
(4.25 − 5)/2 = −0.375. The corresponding value for company B is (3.75 − 4)/1.5 = −0.167. Therefore, the muffler of company B has performed relatively better.
4. E(e^X) = ∫_0^∞ e^x (3e^(−3x)) dx = ∫_0^∞ 3e^(−2x) dx = 3/2.
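A quick numerical confirmation of this expectation (a sketch; the integrand 3e^(−2x) is the one obtained above, and the cutoff at 40 truncates a negligible tail):

```python
from math import exp

# midpoint-rule approximation of the integral of 3 e^(-2x) over (0, 40]
n, hi = 100_000, 40.0
h = hi / n
total = sum(3 * exp(-2 * (i + 0.5) * h) for i in range(n)) * h
print(round(total, 4))  # about 1.5
```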
−1 Expectations and Variances 117 x
dx = 0, because the integrand is an odd function.
√
π 1 − x2 6. Let f be the probability density function of Y . Clearly, f (y) = F (y) = ⎧
⎪ k e−k(α−y)/A
⎪
⎨A
⎪
⎪
⎩ −∞ < y ≤ α 0 y > α. Therefore,
α E(Y ) = −∞ k
k −k(α−y)/A
A ky/A A2 ky/A
dy = e−kα/A
− 2e
ye
ye
A
A
k
k α
−∞ =α− A
.
k 7. Let H be the distribution function of C ; then
P (F ≤ t) = P C ≤ t − 32
t − 32
=H
.
1.8
1.8 Hence the probability density function of F is
1
t − 32
5 t − 32
d
P (F ≤ t) =
h
=h
.
dt
1.8
1.8
9
1.8
The expected value of F is given by
E(F ) = 1.8E(C) + 32 = 1.8 ∞
−∞ xh(x) dx + 32. 2 2 ln x
dx. To calculate this integral, let U = ln x , dV = 1/x 2 , and use
x2
1
integration by parts: 8. E(ln X) = 2
1 2 ln x
2 ln x
dx = −
2
x
x 2 2 −
1 − 1 2
dx = 1 − ln 2 = 0.3069.
x2 9. The expected value of the length of the other side is given by
E 4 81 − X 2 = 81 − x 2 · 2 x
dx.
6 Letting u = 81 − x 2 , we get du = −2x dx and
E 81 − X 2 = 1
12 77
65 √ u du ≈ 8.4. 118 Chapter 6 Continuous Random Variables
∞ 10. E(X) = 1 −x 
xe
dx = 0, because the integrand is an odd function. Now
2 −∞ ∞ E(X2 ) = −∞ ∞ 1 2 −x 
xe
dx =
2 x 2 e−x dx 0 since the integrand is an even function; applying integration by parts to the last integral twice,
we obtain E(X2 ) = 2. Hence Var(X) = 2 − 02 = 2. 11. Note that
E X α = ∞
−∞ x α
2
dx =
2)
π(1 + x
π ∞
0 xα
dx
(1 + x 2 ) since the integrand is an even function. Now for 0 < α < 1,
∞
0 xα
dx =
1 + x2 1
0 xα
dx +
1 + x2 ∞
1 xα
dx.
1 + x2 Clearly, the ﬁrst integral in the right side is convergent. To show that the second one is also
convergent, note that.
xα
1
xα
≤ 2 = 2−α .
2
1+x
x
x
Therefore,
∞
1 xα
dx ≤
1 + x2 ∞
1 1
x 2−α 1
(α − 1)x 1−α dx = ∞
1 = 1
< ∞.
1−α For α ≥ 1,
∞
0
∞ So
0 ∞ xα
≥
1 + x2 1 xα
dx ≥
1 + x2 ∞
1 1
x
dx =
ln(1 + x 2 )
1 + x2
2 ∞
1 = ∞. xα
dx diverges.
1 + x2 12. By Remark 6.4,
∞ E(X) =
0 ∞ P (X > t) dt = (αe−λt + βe−µt ) dt = 0 13. (a) c1 is an arbitrary positive number because ∀c1 ,
.
1 implies that cn = n−1/(n−1)⎧
⎨∞
∞
cn
(b) E(Xn ) =
dx =
n
⎩n(n−2)/(n−1) /(n − 1)
cn x ∞
c1 β
α
+.
λµ c1
dx = 1. For n > 1,
x2 ∞
cn cn
x n+1 dx = if n = 1
if n > 1.
et (c) P (Zn ≤ t) = P (ln Xn ≤ t) = P (Xn ≤ et ) =
cn cn
cn 1
1
dx =
− nt , where
n
x n+1
n cn
e Section 6.3 Expectations and Variances 119 cn = n−1/(n−1) . Let gn be the probability density function of Zn . Then gn (t) = cn e−nt ,
t ≥ ln cn .
m
(d) E(Xn +1 ) = ∞
cn cn x m+1
dx. This integral exists if and only if m − n < −1.
x n+1 14. Using integration by parts twice, we obtain
E(X n+1 ) = 1
π π x n+2 sin x dx = π n+1 + (n + 2) 0 1
π π x n+1 cos x dx 0 1 πn
x sin x dx
π0
= π n+1 + (n + 2) − (n + 1)E(X n−1 ) .
= π n+1 + (n + 2) − (n + 1) Hence E(Xn+1 ) + (n + 1)(n + 2)E(X n−1 ) = π n+1 . 15. Since X is symmetric about α , for all x ∈ (−∞, ∞), f (α + x) = f (α − x). Letting y = x + α ,
we have
E(X) = = ∞
−∞
∞
−∞ yf (y) dy = ∞
−∞ (x + α)f (x + α) dx
∞ xf (x + α) dx + α −∞ f (x + α) dx. Now since f is symmetric about α , xf (x + α) is an odd function,
−xf (−x + α) = − xf (x + α) .
∞ Therefore,
−∞ xf (x + α) = 0. Since E(X) = 0 + α · 1 = α. ∞
−∞ f (x + α) dx = ∞
−∞ f (y) dy = 1, we have To show that the median of X is α , we will show that P (X ≤ α) = P (X ≥ α). This also
shows that the value of these two probabilities is 1/2. Letting u = α − x , we have
∞ α P (X ≤ α) = −∞ f (x) dx = f (α − u) du. 0 Letting u = x − α , we have that
∞ P (X ≥ α) =
α ∞ f (x) dx =
0 f (u + α) du. 120 Chapter 6 Continuous Random Variables Since for all u,
f (α − u) = f (α + u),
we have that
P (X ≤ α) = P (X ≥ α) = 1/2. 16. By Theorem 6.3,
∞ E X − y  = −∞ y =y ∞ y x − y f (x)dx = −∞ −∞ (y − x)f (x) dx +
∞ y f (x) dx − −∞ (x − y)f (x) dx y xf (x) dx + ∞ xf (x) dx − y y f (x) dx.
y Hence
dE X − y 
=
dy −∞ f (x) dx + yf (y) − yf (y) − yf (y) − Setting −∞ f (x) dx + yf (y) y
∞ y = ∞ y f (x) dx − f (x) dx.
y dE X − y 
= 0, we obtain that y is the solution of the following equation:
dy
∞ y
−∞ f (x) dx = f (x) dx.
y By the deﬁnition of the median of a continuous random variable, the solution to this equation
is y = median(X). Hence E X − y  is minimum for y = median(X).
∞ 17. (a) ∞ X I (t) dt = 0 I (t) dt +
0 ∞ X I (t) dt = dt +
0 X 0 dt = X. X ∞ I (t) dt is a random variable.) (Note that
0 ∞ (b) E(X) = E ∞ I (t) dt = 0 ∞ E I (t) d t = 0 ∞ P (X > t) dt = 0 0 (c) By part (b),
∞ E(Xr ) =
0 0 P X> √
r
t dt 0
∞ = ∞ P (Xr > t) dt = 1−F √
r t ∞ dt = r
0 where the last equality follows by the substitution y = y r −1 1 − F (y) d y, √
r
t. 1 − F (t) d t. Section 6.3 Expectations and Variances 121 18. On the interval [n, n + 1),
P X  ≥ n + 1 ≤ P X > t ≤ P X ≥ n .
Therefore,
n+1 n+1 P X  ≥ n + 1 d t ≤ n n+1 P X > t d t ≤ n or n+1 P X  ≥ n + 1 ≤ P X ≥ n d t, n P X > t d t ≤ P X ≥ n . n So ∞ ∞ n+1 P X  ≥ n + 1 ≤
n=0 n=0 and hence ∞ P X > t d t ≤ n P X > n ,
n=0 ∞ ∞ P X  ≥ n ≤ E X ≤ 1 +
n=1 P X ≥ n .
n=1 19. By Exercise 12,
E(X) = α
β
+.
λµ Using Exercise 16, we obtain
∞ E(X2 ) = 2 x(αe−λx + βe−µx ) dx = 0 2α 2β
+ 2.
λ2
µ Hence
Var(X) = 2α 2β
α
β
+2−
+
2
λ
µ
λµ 2 = 2α − α 2 2β − β 2 2αβ
+
−
.
λ2
µ2
λµ 20. X ≥st Y implies that for all t ,
P (X > t) ≥ P (Y > t). (21) Taking integrals of both sides of (21) yields,
∞
0 ∞ P (X > t) dt ≥ P (Y > t) dt.
0 Relation (21) also implies that
1 − P (X ≤ t) ≥ 1 − P (Y ≤ t),
or, equivalently,
P (X ≤ t) ≤ P (Y ≤ t)· (22) 122 Chapter 6 Continuous Random Variables Since this is true for all t , we have
P (X ≤ −t) ≤ P (Y ≤ −t)·
Taking integrals of both sides of this inequality, we have
∞ ∞ P (X ≤ −t) ≤ 0 P (Y ≤ −t) dt, 0 or, equivalently,
∞ − ∞ P (X ≤ −t) ≥ − 0 P (Y ≤ −t) dt. (23) 0 Adding (22) and (23) yields
∞
0 ∞ P (X > t) dt − ∞ P (X ≤ −t) dt ≥ 0 ∞ P (Y > t) dt − 0 P (Y ≤ −t) dt · 0 By Theorem 6.2, this gives E(X) ≥ E(Y ). To show that the converse of this theorem is false,
let X and Y be discrete random variables both with set of possible values {1, 2, 3}. Let the
probability mass functions of X and Y be deﬁned by
pX (1) = 0.3 pX (2) = 0.4 pX (3) = 0.3 pY (1) = 0.5 pY (2) = 0.1 pY (3) = 0.4 We have that E(X) = 2 > E(Y ) = 1.9. However, since
P (X > 2) = 0.3 < P (Y > 2) = 0.4,
we see that X is not stochastically larger than Y . 21. First, we show that limx →−∞ xP X ≤ x = 0. To do so, since x → −∞, we concentrate on
negative values of x . Letting u = −t , we have ∞ x xP X ≤ x = x −∞ f (t) dt = x So it sufﬁces to show that as x → −∞,
∞
−x −∞ ∞
−x ∞
−x −xf (−u) du. −xf (−u) du → 0. Now
∞ −xf (−u) du Therefore, it remains to prove that
∞ ∞
−x −x f (−u) du = − −x uf (−u) du. uf (−u) du → 0 as x → −∞. But this is true because uf (−u) du = ∞
−∞ x f (x) dx < ∞. Chapter 6 Review Problems Next, we will show that limx →∞ xP X > x = 0. To do so, note that
∞ lim xP X > x = lim x x →∞ since ∞
−∞ x →∞ ∞ f (t) dt ≤ lim x →∞ x tf (t) dt = 0 x tf (t) dt < ∞. REVIEW PROBLEMS FOR CHAPTER 6
1. Let F be the distribution function of Y . Clearly, F (y) = 0 if y ≤ 1. For y > 1,
1
1
1
1
y
F (y) = P
≤y =P X≥
=
=1− .
X
y
1−0
y
1− So
f (y) = F (y) = ⎧
⎨1/y 2 y>1 ⎩0 elsewhere. ∞
2
2
2∞
dx =
dx = −
= 2,
x3
x2
x1
1
1
∞
∞
2
x 2 · 3 dx = 2 ln x = ∞. So Var(X) does not exist.
E(X2 ) =
1
x
1 2. E(X) = ∞ x· 1 1
6
1
(6x 2 − 6x 3 ) dx = 2x 3 − x 4 = ,
0
4
2
0
1
1
6
6
3
E(X2 ) =
,
(6x 3 − 6x 4 ) dx = x 4 − x 5 =
0
4
5
10
0
3
1
1
12
Var(X) =
=
−
, σX = √ .
10
2
20
25 3. E(X) = Therefore,
1
1
2
2
P
− √ <X< + √ =
2 25
2 25 1√
1
2+ 5
1√
1
2− 5 (6x − 6x 2 ) dx = 3x − 2x
2 3 1√
1
2+ 5
1√
1
2− 5 11
= √.
55 123 124 Chapter 6 Continuous Random Variables 4. We have that
0
e−x 
1
dx =
ex dx +
2 −2
−2 2
1
1
=1−
− 2 = 0.748.
2e 2e P (−2 < X < 1) = 5. For all c > 0, 1 1 e−x dx 0 ∞ ∞
c
dx = c ln(1 + x)
= ∞.
0
1+x
0
So, for no value of c, f (x) is a probability density function. 6. The set of possible values of X is A = [1, 2]. Let h : [1, 2] → R be deﬁned by h(x) = ex . The set of possible values of eX is B = [e, e2 ]; the inverse of h is g(y) = ln y , where g (y) = 1/y.
Therefore,
4(ln y)3
4(ln y)3
fY (y) =
g (y) =
, y ∈ [e, e2 ].
15
15y
Applying the same procedure to Z and W , we obtain
√
2z
4( z )3 1
fZ (z) =
√ = , z ∈ [1, 4].
15
2z
15
√
2(1 + w )3
fW (w) =
w ∈ [0, 1].
√
15 w 7. The set of possible values of X is A = (0, 1). Let h : (0, 1) → R be deﬁned by h(x) = x 4 .
The set of possible values of X4 is B = (0, 1). The inverse of h(x) = x 4 is g(y) =
1
1
g (y) = y −3/4 = √ √ . We have that
4
4 y4y
√
√
fY (y) = 30( 4 y )2 (1 − 4 y )2
√
15(1 − 4 y )2
=
,
√
24y 8. We have that 1
4 4 y3 1
√
√
= 30 y(1 − 4 y )2 √ √
4 y4y y ∈ (0, 1). ⎧
⎪ √1
⎨
f (x) = F (x) = π 1 − x 2
⎪
⎩
0 Therefore,
E(X) =
since the integrand is an odd function. 1
−1 −1 < x < 1
otherwise. x
dx = 0
√
π 1 − x2 √
4 y . So Chapter 6 9. Clearly n
i =1 125 αi fi ≥ 0. Since
∞ n −∞
n
i =1 Review Problems i =1 ∞ n αi fi (x) dx = αi
i =1 −∞ n fi (x) dx = αi = 1,
i =1 αi fi is a probability density function. 10. Let U = x and dV = f (x)dx . Then dU = dx and V = F (x). Since F (α) = 1,
α xf (x) dx = x F (x) E(X) =
0 α
0 α − F (x) dx
0 α α F (x) dx = α − = αF (α) −
0
α α dx − =
0 F (x) dx
0 α 1 − F (x) d x. F (x) dx =
0 0 11. Let X be the lifetime of a random light bulb. The probability that it lasts over 1000 hours is
P (X > 1000) = ∞
1000 5 × 105
1
dx = 5 × 105 − 2
x3
2x ∞
1000 1
=.
4 Thus the probability that out of six such light bulbs two last over 1000 hours is
6
2 1
4 2 3
4 4 ≈ 0.3 12. Since Y ≥ 0, P (Y ≤ t) = 0 for t < 0. For t ≥ 0,
P (Y ≤ t) = P X  ≤ t = P (−t ≤ X ≤ t) = P (X ≤ t) − P (X < −t)
= P (X ≤ t) − P (X ≤ −t) = F (t) − F (−t).
Hence G, the probability distribution function of X  is given by
G(t) = F (t) − F (−t)
0 if t ≥ 0
if t < 0; g , the probability density function of X is obtained by differentiating G:
g(t) = G (t) = f (t) + f (−t)
0 if t ≥ 0
if t < 0. Chapter 7 Special C ontinuous
Distributions
7.1 UNIFORM RANDOM VARIABLES 1. (23 − 20)/(27 − 20) = 3/7.
2. 15(1/4) = 3.75.
3. Let 2:00 P.M. be the origin, then a and b satisfy the following system of two equations in two
⎧
⎪a + b = 0
⎪
⎨
2
⎪ (b − a)2
⎪
⎩
= 12.
12
Solving this system, we obtain a = −6 and b = 6. So the bus arrives at a random time
between 1:54 P.M. and 2:06 P.M. unknown. 4. P (b2 − 4 ≥ 0) = P (b > 2 or b < −2) = 2/6 = 1/3.
5. The probability density function of R , the radius of the sphere is
⎧
⎪ 1 =1
⎨
2
f (r) = 4 − 2
⎪
⎩0 Thus 4 E(V ) =
2 2<r<4
elsewhere. 4 31
πr
dr = 40π.
3
2 4
1
P π R 3 < 36π = P (R 3 < 27) = P (R < 3) = .
3
2 6. The problem is equivalent to choosing a random number X from (0, ). The desired probability
is
P X≤ 3 +P X ≥ 2
3 = /3 + − (2 /3) = 2
.
3 Section 7.1 127 Uniform Random Variables 7. Let X be a random number from (0, ). The probability of the desired event is
P min(X, − X) ≥ 8. =P X≥ ,
3
3 2
−X ≥
=P
≤X≤
3
3
3 = 2
−
3
3 1
=.
3 180 − 90
3
=.
180 − 60
4 9. Let X be a random point from (0, b). A triangular pen is possible to construct if and only if
the segments a , X , and b − X are sides of a triangle. The probability of this is
a+b
b−a
<X<
2
2
a+b b−a
−
a
2
2
=
=.
b
b P a < X + (b − X), X < a + (b − X), b − X < a + X = P 10. Let F be the probability distribution function and f be the probability density function of X.
By deﬁnition,
F (x) = P (X ≤ x) = P (tan θ ≤ x) = P (θ ≤ arctan x)
π
arctan x − −
1
1
2
=
= arctan x + , −∞ < x < ∞.
π
π
π
2
−−
2
2
Thus
f (x) = F (x) = 1
,
π(1 + x 2 ) −∞ < x < ∞. 11. For i = 0, 1, 2, . . . , n − 1,
i+1
i
≤X<
=
P [nX ] = i = P (i ≤ nX < i + 1) = P
n
n
P [nX ] = i = 0, otherwise.
0 , 1, 2 , . . . , n − 1 . i
i+1
−
1
n
n
=.
n
1−0 Therefore, [nX] is a random number from the set 12. (a) Let G and g be the distribution and density functions of Y , respectively. Since Y ≥ 0,
G(x) = 0 if x ≤ 0. If x ≥ 0, G(x) = P (Y ≤ x) = P − ln(1 − X) ≤ x = P X ≤ 1 − e−x
= (1 − e−x ) − 0
= 1 − e−x .
1−0 128 Chapter 7 Special Continuous Distributions Thus
g(x) = G (x) = e−x
0 x≥0
otherwise. (b) Let H and h be the probability distribution and probability density functions of Z , respectively. For n > 0, H (x) = P (Z ≤ x) = 0, x < 0;
√
√
H (x) = P (Z ≤ x) = P (X ≤ n x ) = n x, 0 < x < 1;
H (x) = 1, if x ≥ 1. Therefore, ⎧
⎪ 1 n −1
⎨ x1
h(x) = H (x) = n
⎪
⎩0 0<x<1
elsewhere. For n < 0, H (x) = P (Xn ≤ x) = 0, x < 1;
1
1
=P X≥
x
x
= P (X ≥ x 1/n ) = 1 − x 1/n , x ≥ 1. H (x) = P (Xn ≤ x) = P X−n ≥ ⎧
⎪ 1 n −1
⎨− x 1
n
h(x) =
⎪
⎩0 Therefore, if x ≥ 1
if x < 1. 13. Cleary, E(X) = (1 + θ )/2. This implies that θ = 2E(X) − 1. Now
Var(X) = E X2 − E (X)
Therefore,
E X2 − 1+θ
2 This yields,
E X2 = 2 = 2 = (1 + θ − 0)2
.
12 1 + 2θ + θ 2
.
12 θ 2 + 2θ + 1
.
3 So
3E(X2 ) − 2θ − 1 = θ 2 .
But θ = 2E(X) − 1; so
3E(X2 ) − 2 2E(X) − 1 − 1 = θ 2 .
This implies that
E(3X 2 − 4X + 1) = θ 2 .
Therefore, one choice for g(X) is g(X) = 3X2 − 4X + 1. 1
−n Section 7.1 Uniform Random Variables 129 14. Let S be the sample space over which X is deﬁned. The functions X : S → R and F : R →
[0, 1] can be composed to obtain the random variable F (X) : S → [0, 1]. Clearly,
if t ≥ 1
if t ≤ 0. 1
0 P F (X) ≤ t = Let t ∈ (0, 1); it remains to prove that P F (X) ≤ t = t . To show this, note that since F
is continuous, F (−∞) = 0, and F (∞) = 1, the inverse image of t , F −1 {t } , is nonempty.
We know that F is nondecreasing; since F is not necessarily strictly increasing, F −1 {t }
might have more than one element. For example, if F is the constant t on some interval
(a, b) ⊆ (0, 1), then F (x) = t for all x ∈ (a, b), implying that (a, b) is contained in F −1 {t } .
Let
x0 = inf x : F (x) > t .
Then F (x0 ) = t and F (x) ≤ t if and only if x ≤ x0 . Therefore,
P F (X) ≤ t = P X ≤ x0 = F (x0 ) = t.
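The conclusion P(F(X) ≤ t) = t can also be checked by simulation (a sketch; exponential(1) is an arbitrary choice of continuous F, and the seed is fixed only for reproducibility):

```python
import random
from math import exp, log

random.seed(7)

F = lambda x: 1 - exp(-x)                   # distribution function of an exponential(1) X
draw_x = lambda: -log(1 - random.random())  # sample X by inverse transform

n = 100_000
u = sorted(F(draw_x()) for _ in range(n))
# if F(X) is uniform on (0, 1), the empirical CDF at u[i] should be close to u[i]
dev = max(abs(u[i] - (i + 1) / n) for i in range(n))
print(round(dev, 3))  # small KS-type deviation
```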
We have shown that ⎧
⎪0
⎨
P F (X) ≤ t = t
⎪
⎩
1 if t ≤ 0
if 0 ≤ t ≤ 1
if t ≥ 1, meaning that F (X) is uniform over (0, 1). 15. We are given that Y is a uniform random variable. First we show that Y is uniform over the
interval (0, 1). To do this, it sufﬁces to show that P (Y ≤ 1) = 1 and P (Y < 0) = 0. These
∞ are obvious implications of the fact that g is nonnegative and
−∞ g(x) dx = 1: X P (Y ≤ 1) = P −∞ g(t) dt ≤ 1 = 1. X P (Y < 0) = P −∞ g(t) dt < 0 = 0, The following relation shows that the probability density function of X is g .
⎛u
⎞
g(t) dt − 0
u
⎟
d
d ⎜ −∞
d
⎜
⎟ = g(u),
g(t) dt =
P (X ≤ u) =
P Y≤
⎝
⎠
du
du
du
1−0
−∞
where the last equality follows from the fundamental theorem of calculus. 130 Chapter 7 Special Continuous Distributions 16. Let F be the distribution function of X, then F (t) = P (X ≤ t) is 0 for t < −1 and is 1 for
t ≥ 4. Let −1 ≤ t < 4; we have that t +1
5
t +1
.
dx =
5 F (t) = P (X ≤ t) = P (5ω − 1 ≤ t) = P ω ≤
= P ω ∈ 0,
Therefore, t +1
5 (t +1)/5 =
0 ⎧
⎪0
⎪
⎪
⎪
⎨
t +1
F (t) =
⎪5
⎪
⎪
⎪
⎩1 t < −1
−1 ≤ t < 4
t ≥ 4. This is the distribution function of a uniform random variable over (−1, 4).
√
17. We have that X = n if and only if Y = 0.y1 ny3 y4 y5 · · · , or, equivalently, if and only if,
√
10 Y = y1 .ny3 y4 y5 · · · . Therefore, X = n if and only if for some k ∈ 0, 1, 2, . . . , 9 ,
k+ √
n+1
n
≤ 10 Y < k +
.
10
10 This is equivalent to
n
1
k+
100
10 2 ≤Y < n+1
1
k+
100
10 2 . Therefore, the desired probability is
9 P
k =0 n
1
k+
100
10
9 =
k =0
9 =
k =0 2 ≤Y < 1
n+1
k+
100
10 2 n+1
1
k+
100
10 2 n
1
k+
100
10 2 − 20k + 2n + 1
= 0.091 + 0.002n.
10,000. We see that this quantity increases as n does.

7.2
44.5 − 45
= P (Z ≥ −0.24)
2.12
(−0.24) = (0.24) = 0.5948. P (X ≥ 44.5) = P Z ≥
=1− 2. np = 1095/365 = 3 and np(1 − p) =
P (X ≥ 5.5) = P Z ≥ 3 364
= 1.73. Therefore,
365 5.5 − 3
=1−
1.73 (1.45) = 0.0735. 3. We have that
P (Z ) ≤ x) = P (−x ≤ Z ≤ x) =
= 4. Let (x) − 1 − (x) − (−x) (x) = 2 (x) − 1 = 1
g(x) = P (x < Z < x + α) = √
2π x +α e−y 2 /2 (x). dy. x The number x that maximizes P (x < Z < x + α) is the root of g (x) = 0; that is, it is the
solution of
1
2
2
e−(x +α) /2 − e−x /2 = 0,
g (x) = √
2π
which is x = −α/2.
∞
1
X
2
are, respectively, √
(x cos x)e−x /2 dx ,
1 + X2
2π −∞
∞
∞
1
x
1
2 /2
2 /2
(sin x)e−x dx , and √
e−x dx. Since these are integrals of
√
2π −∞
2π −∞ 1 + x 2 5. E(X cos X), E(sin X), and E odd functions from −∞ to ∞, all three of them are 0. 6. (a) P (X > 35.5) = P X − 35.5
35.5 − 35.5
>
=1−
4.8
4.8 (0) = 0.5. (b) The desired probability is given by
P (30 < X < 40) = P
= 40 − 35.5
30 − 35.5
<X<
= (0.94) − (−1.15)
4.8
4.8
(0.94) + (1.15) − 1 = 0.8264 + 0.8749 − 1 = 0.701. 132 Chapter 7 Special Continuous Distributions 7. Let X be the grade of a randomly selected student;
90 − 67
= 1 − (2.88) = 1 − 0.9980 = 0.002,
8
80 − 67
90 − 67
P (80 ≤ X < 90) = P
≤Z<
= (2.88) − (1.63)
8
8
P (X ≥ 90) = P Z ≥ = 0.9980 − 0.9484 = 0.0496.
Similarly, P (70 ≤ X < 80) = 0.3004, P (60 ≤ X < 70) = 0.4586, and P (X < 60) =
0.1894. Therefore, approximately 0.2%, 4.96%, 30.04%, 45.86%, and 18.94% get A, B, C, D,
and F, respectively. 8. Let X be the blood pressure of a randomly selected person;
P (89 < X < 96) = P 89 − 80
96 − 80
<Z<
= P (1.29 < Z < 2.29) = 0.0875,
7
7 P (X > 95) = P Z > 95 − 80
= 0.016.
7 Therefore, 8.75% have mild hypertension while 1.6% are hypertensive. 9. P (74.5 < X < 75.8) = P (−0.5 < Z < 0.8) = (0.8) − 1 − (0.5) = 0.4796. 10. We must ﬁnd x so that P (110 − x < X < 110 + x) = 0.50, or, equivalently,
P− x
X − 110
x
<
<
= 0.50.
20
20
20 Therefore, we must ﬁnd the value of x which satisﬁes P − x/20 < Z < x/20 = 0.50 or
(x/20)− (−x/20) = 0.50. Since (−x/20) = 1− (x/20), x satisﬁes 2 (x/20) = 1.50
or (x/20) = 0.75. Using Table 1 of the appendix, we get x/20 = 0.67 or x = 13.4 So the
desired interval is (110 − 13.4, 110 + 13.4) = (96.6, 123.4). 11. Let X be the amount of cereal in a box. We want to have P (X ≥ 16) ≥ 0.90. This gives
P Z≥ 16 − 16.5
≥ 0.90,
σ or (0.5/σ ) ≥ 0.90. The smallest value for 0.5/σ satisfying this inequality is 1.29; so the
largest value for σ is obtained from 0.5/σ = 1.29. This gives σ = 0.388. 12. Let X be the score of a randomly selected individual;
P (X ≥ 14) = P Z ≥ 14 − 12
= P (Z ≥ 0.67) = 0.2514.
3 Therefore, the probability that none of the eight individuals make a score less than 14 is
(0.2514)8 = 0.000016. Section 7.2 Normal Random Variables 133 13. We want to ﬁnd t so that P (X ≤ t) = 1/2. This implies that
P
or t −µ
1
X−µ
≤
=,
σ
σ
2 1
t −µ
t −µ
= ; so
= 0 which gives t = µ.
σ
2
σ 14. We have that
P (X − µ > kσ ) = P (X − µ > kσ ) + P (X − µ < −kσ ) = P (Z > k) + P (Z < −k)
= 1− (k) + 1 − (k) = 2 1 − (k) . This shows that P (X − µ > kσ ) does not depend on µ or σ . 15. Let X be the lifetime of a randomly selected light bulb.
900 − 1000
=1−
100
Hence the company’s claim is false.
P (X ≥ 900) = P Z ≥ (−1) = (1) = 0.8413. 16. Let X be the lifetime of the light bulb manufactured by the ﬁrst company. Let Y be the
lifetime of the light bulb manufactured by the second company. Assuming that X and Y are
independent, the desired probability, P max(X, Y ) ≥ 980 , is calculated as follows.
P max(X, Y ) ≥ 980 = 1 − P max(X, Y ) < 980 = 1 − P (X < 980, Y < 980)
= 1 − P (X < 980) P (Y < 980)
=1−P Z < 980 − 900
980 − 1000
P Z<
100
150 = 1 − P (Z < −0.2)P (Z < 0.53) = 1 − 1 − (0.2) (0.53) = 1 − (1 − 0.5793)(0.7019) = 0.7047. 17. Let r be the rate of return of this stock; r is a normal random variable with mean µ = 0.12 and standard deviation σ = 0.06. Let n be the number of shares Mrs. Lovotti should purchase.
We want to ﬁnd the smallest n for which the probability of proﬁt in one year is at least $1000.
Let X be the current price of the total shares of the stock that Mrs. Lovotti buys this year,
and Y be the total price of the shares next year. We want to ﬁnd the smallest n for which
P (Y − X ≥ 1000). We have
Y −X
1000
1000
≥
=P r≥
X
X
X
⎛
⎞
1000
− 0.12 ⎟
⎜
1000
35n
⎟ ≥ 0.90.
=P r≥
= P ⎜Z ≥
⎝
⎠
35n
0.06 P (Y − X ≥ 1000) = P 134 Chapter 7 Special Continuous Distributions Therefore, we want to ﬁnd the smallest n for which
⎛
⎞
1000
− 0.12 ⎟
⎜
35n
⎟ ≤ 0.10.
P ⎜Z ≤
⎝
⎠
0.06
By Table 1 of the Appendix, this is satisﬁed if
1000
− 0.12
35n
≤ −1.29.
0.06
This gives n ≥ 670.69. Therefore, Mrs. Lovotti should buy 671 shares of the stock. 18. We have that
1
(x − 1)2
1
(x − 1)2
f (x) = √ √ exp −
=
.
√ exp −
1/2
2(1/4)
1/2 π
(1/2) 2π
This shows that f is the probability density function of a normal random variable with mean
1 and standard deviation 1/2 (variance 1/4). 19. Let F be the distribution function of X − µ. F (t) = 0 if t < 0; for t ≥ 0,
F (t) = P X − µ ≤ t = P (−t ≤ X − µ ≤ t)
t
X−µ
t
= P (µ − t ≤ X ≤ µ + t) = P − ≤
≤
σ
σ
σ
t
t
t
t
−
−
=
− 1−
=2
=
σ
σ
σ
σ
Therefore,
F (t) = ⎧
⎨2
⎩ t
σ −1 0 t
σ t ≥0
otherwise. This gives
F (t) = 2
σ t
σ Hence ∞ E X − µ = t
0 t ≥ 0.
2
σ t
d t.
σ substituting u = t/σ , we obtain
∞
2σ
2
(u) du = √
ue−u /2 du
2π 0
0
∞
2
2σ
2σ
2
=σ
=√
.
=√
− e−u /2
0
π
2π
2π E(X − µ) = 2σ ∞ u − 1. Section 7.2 Normal Random Variables 135 20. The general form of the probability density function of a normal random variable is
f (x) = 1
1
µ
µ2
(x − µ)2
12
= √ exp −
.
x + 2x −
√ exp −
2σ 2
2σ 2
σ
2σ 2
σ 2π
σ 2π Comparing this with the given probability density function, we see that
⎧√
1
⎪ k= √
⎪
⎪
⎪
σ 2π
⎪
⎪
⎪
⎪
⎪2
⎪k = 1
⎪
⎨
2σ 2
⎪2k = − µ
⎪
⎪
⎪
σ2
⎪
⎪
⎪2
⎪µ
⎪
⎪
⎪
⎩ 2 = 1.
2σ
√
Solving the ﬁrst two equations for k and σ , we obtain k = π and σ = 1/(π 2). These and
the third equation give µ = −1/π which satisfy the fourth equation. So k = π and f is the
11
probability density function of N − , 2 .
π 2π 21. Let X be the viscosity of the given brand. We must ﬁnd the smallest x for which P (X ≤ x) ≥
0.90 or P Z ≤
x = 49.9. x − 37
≥ 0.90. This gives
10 x − 37
≥ 0.90 or (x − 37)/10 = 1.29; so
10 22. Let X be the length of the residence of a family selected at random from this town. Since
P (X ≥ 96) = P Z ≥ 96 − 80
= 0.298,
30 using binomial distribution, the desired probability is
2 12
(0.298)i (1 − 0.298)12−i = 0.742.
i 1−
i =0 23. We have
E(eαZ ) = ∞ 1
2
eαx · √ e−x /2 dx
2π
−∞ = eα 2 /2 = eα 2 /2 ∞
−∞
∞
−∞ 1
12
12
√ e− 2 α +αx − 2 x dx
2π
1
1
2
2
√ e− 2 (x −α) dx = eα /2 ,
2π 136 Chapter 7 Special Continuous Distributions ∞ 1
1
1
1
2
2
√ e− 2 (x −α) dx = 1, since √ e− 2 (x −α) is the probability density function
2π
2π
−∞
of a normal random variable with mean α and variance 1.
where 24. For t ≥ 0,
√
√
P (Y ≤ t) = P − t ≤ X ≤ t = P − √
√
t
t
≤Z≤
σ
σ =2 √
t
− 1.
σ Let f be the probability density function of Y . Then
√
t
,
σ d
1
f (t) = P (Y ≤ t) = 2 √
dt
2σ t
⎧
1
t
⎪√
⎨
exp −
2σ 2
f (t) = σ 2π t
⎪
⎩
0 So t ≥ 0. t ≥0
t ≤ 0. 25. For t ≥ 0,
P (Y ≤ t) = P eX ≤ t = P (X ≤ ln t) = P Z ≤ ln t − µ
=
σ ln t − µ
.
σ Let f be the probability density function of Y . We have
f (t) =
So
f (t) = d
1
P (Y ≤ t) =
dt
σt ⎧
⎪
⎨
⎪
⎩ ln t − µ
,
σ t ≥ 0. 1
(ln t − µ)2
√ exp −
2σ 2
σ t 2π t ≥0 0 otherwise. 26. Let f be the probability density function of Y . Since for t ≥ 0,
P (Y ≤ t) = P
we have that X  ≤ t = P X  ≤ t 2 = P − t 2 ≤ X ≤ t 2 = 2 (t 2 ) − 1, ⎧
⎪4t √1 e−t 4 /2
⎨
d
2π
f (t) = P (Y ≤ t) =
⎪
dt
⎩
0 t ≥0
otherwise. 27. Suppose that X is the number of books sold in a month. The random variable X is binomial
with parameters n = (800)(30) = 24, 000 and p = 1/5001. Moreover, E(X) = np = 4.8
√
and σX = np(1 − p) = 2.19. Let k be the number of copies of the bestseller to be ordered Section 7.2 Normal Random Variables 137 every month. We want to have P (X < k) > 0.98 or P (X ≤ k − 1) > 0.98. Using
De MoivreLaplace theorem and making correction for continuity, this inequality is valid if
X − 4.8
k − 1 + 0.5 − 4.8
<
> 0.98.
2.19
2.19 P From Table 1 of the appendix, we have (k − 1 + 0.5 − 4.8)/2.19 = 2.06, or k = 9.81.
Therefore, the store should order 10 copies a month. 28. Let X be the number of light bulbs of type I. We want to calculate P (18 ≤ X ≤ 22).
Since the number of light bulbs is large and half of the light bulbs are type I, we can assume
that
√
√ X is approximately binomial with parameters 40 and 1/2. Note that np = 20 and
np(1 − p) = 10. Using De MoivreLaplace theorem and making correction for continuity,
we have
P (17.5 ≤ X ≤ 22.5) = P
= 17.5 − 20
X − 20
22.5 − 20
≤√
≤√
√
10
10
10
(0.79) − (−0.79) = 2 (0.79) − 1 = 0.5704. Remark: Using binomial distribution, the solution to this problem is
22
i =18 40
i 1
2 i 1
2 40−i = 0.5704. As we see, up to at least 4 decimal places, this solution gives the same answer as obtained
above. This indicates the importance of correction for continuity; if it is ignored, we obtain
0.4714, an answer which is almost 10% lower than the actual answer. 29. Let X be the number of 1’s selected; X is binomial with parameters 100, 000 and 1/40. Thus
√
np = 2500 and np(1 − p) = 49.37. So P (X ≥ 3500) ≈ P Z ≥ 3499.50 − 2500
=1−
49.37 (20.25) = 0. Hence it is fair to say that the algorithm is not accurate. 30. Note that x2
.
1/ ln a
Comparing this with the probability density function of a normal random variable with pa√
rameters µ and σ , we see that µ = 0 and 2σ 2 = 1/ ln a. Thus σ = 1/(2 ln a), and hence
ka −x = k exp − x 2 ln a = k exp −
2 k= 1
=
σ 2π ln a
.
π So, for this value of k , the function f is the probability density function a normal random
√
variable with mean 0 and standard deviation 1/(2 ln a). 138 Chapter 7 Special Continuous Distributions 31. (a) The derivation of these inequalities from the hint is straightforward.
(b) By part (a),
1− 1
1 − (x)
<
< 1.
√
2
2
x
1/(x 2π ) e−x /2 Thus
1 ≤ lim x →∞ 1 − (x)
≤ 1,
√
2
1/(x 2π ) e−x /2 from which (b) follows. 32. By part (b) of Exercise 31,
x
lim P Z > t +
t →∞
t Z ≥ t = lim t →∞ = lim t →∞ x
t
P (Z ≥ t) P Z>t+ 1
x
x √ exp − t + t
t+
2π
t
1
2
√ e−t /2
t 2π 2 2 x2
t2
exp − x − 2 = e−x .
t →∞ t 2 + x
33. Let X be the amount of soft drink in a random bottle. We are given that P(X < 15.5) = 0.07 and P(X > 16.3) = 0.10. These imply that Φ((15.5 − µ)/σ) = 0.07 and Φ((16.3 − µ)/σ) = 0.90. Using Tables 1 and 2 of the appendix, we obtain

(15.5 − µ)/σ = −1.48,  (16.3 − µ)/σ = 1.28.

Solving these two equations in two unknowns, we obtain µ = 15.93 and σ = 0.29.

34. Let X be the height of a randomly selected skeleton from group 1. Then
P (X > 185) = P Z > 185 − 172
= P (Z > 1.44) = 0.0749.
9 Section 7.3 Exponential Random Variables 139 Now suppose that the skeleton’s of the second group belong to the family of the ﬁrst group.
The probability of ﬁnding three or more skeleton’s with heights above 185 centimeters is
5
i =3 5
(0.0749)i (0.9251)5−i = 0.0037.
i Since the chance of this event is very low, it is reasonable to assume that the second group is
not part of the ﬁrst one. However, we must be careful that in reality, this observation is not
sufﬁcient to make a judgment. In the lack of other information, if a decision is to be made
solely based on this observation, then we must reject the hypothesis that the second group is
part of the ﬁrst one. 35. For t ∈ (0, ∞), let A be the region whose points have a (positive) distance t or less from the
given tree. The area of A is π t 2 . Let X be the distance from the given tree to its nearest tree.
We have that
e−λπ t (λπ t 2 )0
2
P (X > t) = P (no trees in A) =
= e−λπ t .
0!
2 Now by Remark 6.4,
∞ E(X) = ∞ P (X > t) dt = 0 e−λπ t dt.
2 0 √
Letting u =
2λπ t , we obtain
11
E(X) = √ √
λ 2π ∞ e−u 2 /2 0 1
11
= √.
du = √
λ2
2λ 36. Note that dy = xds ; so
I² = ∫₀^∞ ∫₀^∞ e^{−(x² + x²s²)/2} x ds dx = ∫₀^∞ ∫₀^∞ e^{−x²(1+s²)/2} x dx ds   (let u = x²)

= ∫₀^∞ ∫₀^∞ (1/2) e^{−u(1+s²)/2} du ds = ∫₀^∞ [−e^{−u(1+s²)/2}/(1+s²)]₀^∞ ds

= ∫₀^∞ ds/(1+s²) = arctan s |₀^∞ = π/2.

7.3 EXPONENTIAL RANDOM VARIABLES

1. Let X be the time until the next customer arrives; X is exponential with parameter λ = 3.
Hence P(X > x) = e^{−λx}, and P(X > 3) = e^{−9} = 0.0001234.

2. Let m be the median of an exponential random variable with rate λ. Then P(X > m) = 1/2; thus e^{−λm} = 1/2, or m = (ln 2)/λ.

3. For −∞ < y < ∞,

P(Y ≤ y) = P(−ln X ≤ y) = P(X ≥ e^{−y}) = e^{−e^{−y}}.

Thus g(y), the probability density function of Y, is given by

g(y) = (d/dy) P(Y ≤ y) = e^{−y} · e^{−e^{−y}} = e^{−y − e^{−y}}.

4. Let X be the time between the first and second heart attacks. We are given that P(X ≤ 5) = 1/2. Since the exponential is memoryless, the probability that a person who had one heart attack five years ago will not have another one during the next five years is still P(X > 5), which is 1 − P(X ≤ 5) = 1/2.

5. (a) Suppose that the next customer arrives in X minutes. By the memoryless property, the
desired probability is

P(X < 1/30) = 1 − e^{−5(1/30)} = 0.1535.

(b) Let Y be the time between the arrival times of the 10th and 11th customers; Y is exponential with λ = 5. So the answer is

P(Y ≤ 1/30) = 1 − e^{−5(1/30)} = 0.1535.

6. Since E(X) = 1/λ and σ_X = 1/λ,

P(|X − E(X)| ≥ 2σ_X) = P(|X − 1/λ| ≥ 2/λ) = P(X − 1/λ ≥ 2/λ) + P(X − 1/λ ≤ −2/λ)

= P(X ≥ 3/λ) + P(X ≤ −1/λ) = e^{−λ(3/λ)} + 0 = e^{−3} = 0.049787.

7. (a) P(X > t) = e^{−λt}.

(b) P(t ≤ X ≤ s) = (1 − e^{−λs}) − (1 − e^{−λt}) = e^{−λt} − e^{−λs}.

8. The number of documents typed by the secretary on a given eight-hour working day is Poisson
with parameter λ = 8. So the answer is

Σ_{i=12}^{∞} e^{−8} 8^i/i! = 1 − Σ_{i=0}^{11} e^{−8} 8^i/i! = 1 − 0.888 = 0.112.

9. The answer is

E[350 − 40N(12)] = 350 − 40 · (1/18) · 12 = 323.33.

10. Mr. Jones makes his phone calls when either A or B has finished his call. At that time the
remaining phone call of A or B, whichever is not finished, and the duration of Mr. Jones's call have the same distribution, due to the memoryless property of the exponential distribution. Hence, by symmetry, the probability that Mr. Jones finishes his call sooner than the other one is 1/2.

11. Let N(t) be the number of changes of state occurring in [0, t]. Let X₁ be the time until the
machine breaks down for the first time, X₂ the time it will take to repair the machine, X₃ the time from when the machine is fixed until it breaks down again, and so on. Clearly, X₁, X₂, ... are the times between consecutive changes of state. Since {X₁, X₂, ...} is a sequence of independent and identically distributed exponential random variables with mean 1/λ, by Remark 7.2, {N(t) : t ≥ 0} is a Poisson process with rate λ. Therefore, N(t) is a Poisson random variable with parameter λt.

12. The probability mass function of L is given by
P(L = n) = (1 − p)^{n−1} p,  n = 1, 2, 3, .... Hence

P(L > n) = (1 − p)^n,  n = 0, 1, 2, .... Therefore, for x > 0,

P(T ≤ x) = P(L ≤ 1000x) = 1 − P(L > 1000x) = 1 − (1 − p)^{1000x}

= 1 − e^{1000x ln(1−p)} = 1 − e^{−x[−1000 ln(1−p)]}. This shows that T is exponential with parameter λ = −1000 ln(1 − p).
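The identity (1 − p)^{1000x} = e^{1000x ln(1−p)} used above can be spot-checked numerically; a minimal sketch, where the value p = 0.001 is an arbitrary choice for illustration:

```python
import math

p = 0.001                      # hypothetical per-character failure probability
lam = -1000 * math.log(1 - p)  # rate of the limiting exponential distribution

def cdf_T(x):
    """P(T <= x) computed from the geometric tail, as in the solution."""
    return 1 - (1 - p) ** (1000 * x)

def cdf_exp(x):
    """CDF of an exponential random variable with rate lam."""
    return 1 - math.exp(-lam * x)

# The two expressions agree for every x >= 0, confirming that T is
# exponential with parameter lam = -1000 ln(1 - p).
for x in [0.0, 0.5, 1.0, 2.7]:
    assert abs(cdf_T(x) - cdf_exp(x)) < 1e-12
```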
13. (a) We must have ∫_{−∞}^{∞} c e^{−|x|} dx = 1; thus

c = 1 / ∫_{−∞}^{∞} e^{−|x|} dx = 1 / (2 ∫₀^∞ e^{−x} dx) = 1/2.

(b) E(X^{2n+1}) = ∫_{−∞}^{∞} (1/2) x^{2n+1} e^{−|x|} dx = 0, because the integrand is an odd function.

E(X^{2n}) = ∫_{−∞}^{∞} (1/2) x^{2n} e^{−|x|} dx = ∫₀^∞ x^{2n} e^{−x} dx,

because the integrand is an even function. We now use induction to prove that ∫₀^∞ xⁿ e^{−x} dx = n!. For n = 1, the integral is the expected value of an exponential random variable with parameter 1; so it equals 1 = 1!. Assume that the identity is valid for n − 1. Using integration by parts, we show it for n:

∫₀^∞ xⁿ e^{−x} dx = [−xⁿ e^{−x}]₀^∞ + ∫₀^∞ n x^{n−1} e^{−x} dx = 0 + n(n − 1)! = n!.

Hence E(X^{2n}) = (2n)!.
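The induction identity ∫₀^∞ xⁿ e^{−x} dx = n! and the resulting even moments can be spot-checked by numerical quadrature; a sketch, where the midpoint rule and the cutoff 80 are ad-hoc choices:

```python
import math

def integrate(f, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

# Check the induction identity for n = 3: integral of x^3 e^{-x} is 3! = 6.
val = integrate(lambda x: x**3 * math.exp(-x), 0, 80)
assert abs(val - math.factorial(3)) < 1e-3

# Check E(X^{2n}) = (2n)! for n = 2 under the density f(x) = (1/2) e^{-|x|}:
# by symmetry E(X^4) equals the integral of x^4 e^{-x} over [0, inf) = 4! = 24.
moment4 = integrate(lambda x: x**4 * math.exp(-x), 0, 80)
assert abs(moment4 - math.factorial(4)) < 1e-3
```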
14. P([X] = n) = P(n ≤ X < n + 1) = ∫_n^{n+1} λe^{−λx} dx = [−e^{−λx}]_n^{n+1} = e^{−λn}(1 − e^{−λ}).

This is the probability mass function of a geometric random variable with parameter p = 1 − e^{−λ}.

15. Let G(t) = P(X > t) = 1 − F(t). By the memoryless property of X,
P (X > s + t  X > t) = P (X > s),
for all s ≥ 0 and t ≥ 0. This implies that
P (X > s + t) = P (X > s)P (X > t),
or
G(s + t) = G(s)G(t), t ≥ 0, s ≥ 0. (24) Now for arbitrary positive integers n and m, (24) gives that
G(2/n) = G(1/n + 1/n) = G(1/n)G(1/n) = G(1/n)²,

G(3/n) = G(2/n + 1/n) = G(2/n)G(1/n) = G(1/n)³,

...

G(m/n) = G(1/n)^m.

Also

G(1) = G(1/n + 1/n + ··· + 1/n)   (n terms) = G(1/n)^n

yields

G(1/n) = G(1)^{1/n}.   (25)

Hence

G(m/n) = G(1)^{m/n}.   (26)

Now we show that G(1) > 0. If not, G(1) = 0 and by (25), G(1/n) = 0 for all positive
integers n. This and the right continuity of G imply that

P(X ≤ 0) = F(0) = 1 − G(0) = 1 − G(lim_{n→∞} 1/n) = 1 − lim_{n→∞} G(1/n) = 1 − 0 = 1,

which contradicts the given fact that X is a positive random variable. Thus G(1) > 0 and we can define λ = −ln G(1). This gives

G(1) = e^{−λ},

and by (26), G(m/n) = e^{−λ(m/n)}. Thus far, we have proved that for any positive rational t,

G(t) = e^{−λt}.   (27)

To prove the same relation for a positive irrational number t, recall from calculus that for each positive integer n there exists a rational number t_n in (t, t + 1/n). Since t < t_n < t + 1/n, lim_{n→∞} t_n exists and equals t. On the other hand, because F is right continuous, G = 1 − F is also right continuous, and so

G(t) = lim_{n→∞} G(t_n).

But since t_n is rational, (27) implies that G(t_n) = e^{−λt_n}. Hence

G(t) = lim_{n→∞} e^{−λt_n} = e^{−λt}.

Thus F(t) = 1 − e^{−λt} for all t, and X is exponential.
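The characterizing functional equation G(s + t) = G(s)G(t) is easy to confirm for the exponential survival function; a minimal sketch, where λ = 0.8 is an arbitrary illustrative rate:

```python
import math

lam = 0.8  # arbitrary rate, for illustration only

def G(t):
    """Survival function of an exponential random variable with rate lam."""
    return math.exp(-lam * t)

# G satisfies G(s + t) = G(s) G(t), i.e. relation (24) in the proof.
for s, t in [(0.3, 1.1), (2.0, 0.7), (5.0, 5.0)]:
    assert abs(G(s + t) - G(s) * G(t)) < 1e-12

# Consequently G(1/n) = G(1)^(1/n), which is relation (25).
for n in [2, 3, 10]:
    assert abs(G(1 / n) - G(1) ** (1 / n)) < 1e-12
```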
Remark: If X is memoryless, then P (X ≤ 0) = 0. To see this, note that P (X > s + t  X >
t) = P (X > s) implies P (X ≤ s + t  X > t) = P (X ≤ s). Letting s = t = 0, we get
P (X ≤ 0  X > 0) = P (X ≤ 0). But P (X ≤ 0  X > 0) = 0; therefore P (X ≤ 0) = 0.
This shows that the memoryless property cannot be deﬁned for random variables possessing
nonpositive values with positive probability. 144 Chapter 7 7.4 GAMMA DISTRIBUTIONS Special Continuous Distributions 1. Let f be the probability density function of a gamma random variable with parameters r and
λ. Then
f (x) = λr x r −1 e−λx
.
(r) Therefore,
f (x) = λr +1 r −2 −λx
r −1
λr
x−
− λe−λx x r −1 + e−λx (r − 1)x r −2 = −
xe
.
(r)
(r)
λ This relation implies that the function f is increasing if x < (r − 1)/λ, it is decreasing if
x > (r − 1)/λ, and f (x) = 0 if x = (r − 1)/λ. Therefore, x = (r − 1)/λ is a maximum
of the function f . Moreover, since f has only one root, the point x = (r − 1)/λ is the only
maximum of f . 2. We have that
(λe−λx )(λx)r −1
dx
(let u = cx )
(r)
λe−λu/c (λu/c)r −1
(1/c) du
(r)
(λ/c)e−λu/c (λu/c)r −1
du.
(r) t /c P (cX ≤ t) = P (X ≤ t/c) =
0
t =
0
t =
0 This shows that cX is gamma with parameters r and λ/c. 3. Let N(t) be the number of babies born at or prior to t . N (t) : t ≥ 0 is a Poisson process
with λ = 12. Let X be the time it takes before the next three babies are born. The random
variable X is gamma with parameters 3 and 12. The desired probability is
P (X ≥ 7/24) = ∞
7/24 12e−12x (12x)2
dx = 864
(3) ∞ x 2 e−12x dx. 7/24 Applying integration by parts twice, we get
x 2 e−12x dx = − 1 2 −12x
1
1 −12x
xe
e
− xe−12x −
+ c.
12
72
864 Thus
7
1
1
1 −12x ∞
= 864 − x 2 e−12x − xe−12x −
e
= 0.3208.
7/24
24
12
72
864
Remark: A simpler way to do this problem is to avoid gamma random variables and use the
properties of Poisson processes:
P X≥ PN 7
≤2 =
24 2 PN
i =0 7
=i =
24 2
i =0 e−(7/24)12 (7/24)12
i! i = 0.3208. Section 7.4 Gamma Distributions 145 4.
∞
−∞ ∞ f (x) dx =
0 λe−λx (λx)r −1
λr
dx =
(r)
(r) ∞ e−λx x r −1 dx. 0 Let t = λx ; then dt = λdx , so
∞
−∞ λr
(r)
1
=
(r) f (x) dx = ∞ t r −1 1
· dx
λr −1 λ
0
∞
1
e−t t r −1 dt =
(r) = 1.
(r)
0
e−t · 5. Let X be the time until the restaurant starts to make proﬁt; X is a gamma random variable with
parameters 31 and 12. Thus E(X) = 31/12; that is, two hours and 35 minutes. 6. By the method of Example 5.17, the number of defective light bulbs produced is a Poisson
process at the rate of (200)(0.015) = 3 per hour. Therefore, X, the time until 25 defective
light bulbs are produced is gamma with parameters λ = 3 and r = 25. Hence E(X) = r
25
=
= 8.33.
λ
3 That is, it will take, on average, 8 hours and 20 minutes to ﬁll up the can. 7.
1
=
2 ∞ t −1/2 e−t dt. 0 Making the substitution t = y 2 /2, we get
√
∞
√
2 ∞ −y 2 /2
1
−y 2 /2
e
dy =
e
dy
=2
2
2 −∞
0
∞
√
√
1
2
= π·√
e−y /2 dy = π .
2π −∞ 146 Chapter 7 Special Continuous Distributions Hence
1
3
=
2
2 1
1√
= · π,
2
2 5
3
=
2
2 3
31√
= · · π,
2
22 7
5
=
2
2 5
531√
= · · · π,
2
222 .
.
. n+ 2n + 1
2n − 1 2n − 3
7531√
=
·
··· · · · · π
2
2
2
2222 1
=
2
= 2 2n √
(2n)!
π
(2n) · · · 6 · 4 · 2 √
√
(2n)! π
(2n)! π
=n n
=n
.
2 · 2 · n!
4 · n! 8. (a) Let F be the probability distribution function of Y . For t ≤ 0, F (t) = P (Z 2 ≤ t) = 0.
For t > 0,
F (t) = P (Y ≤ t) = P Z 2 ≤ t = P −
= √
t− − √ t= √
√
t ≤Z≤ t √
t − 1− √ t =2 √
t − 1. Let f be the probability density function of Y . For t ≤ 0, f (t) = 0. For t > 0,
1 −t/2 1 −1/2
e
t
√
1 −t/2
1
1
2
e−t/2 = 2
,
t =√ ·√ e
=√
(1/2)
t
2π
2π t
√
where by the previous exercise, π = (1/2). This shows that Y is gamma with parameters
λ = 1/2 and r = 1/2.
1
f (t) = F (t) = 2 · √
2t (b) Since (X − µ)/σ is standard normal, by part (a), W is gamma with parameters λ = 1/2
and r = 1/2. 9. The following solution is an intuitive one. A rigorous mathematical solution would have to
consider the sum of two random variables, each being the minimum of n exponential random Section 7.5 Beta Distributions 147 variables; so it would require material from joint distributions. However, the intuitive solution
has its own merits and it is important for students to understand it.
Let the time Howard enters the bank be the origin and let N (t) be the number of customers
served by time t . As long as all of the servers are busy, due to the memoryless property of
the exponential distribution, N (t) : t ≥ 0 is a Poisson process with rate nλ. This follows
because if one server serves at the rate λ, n servers will serve at the rate nλ. For the Poisson
process N (t) : t ≥ 0 , every time a customer is served and leaves, an “event” has occurred.
Therefore, again because of the memoryless property, the service time of the person ahead
of Howard begins when the ﬁrst “event” occurs and Howard’s service time begins when the
second “event” occurs. Therefore, Howard’s waiting time in the queue is the time of the
second event of the Poisson process N (t), t ≥ 0 . This period, as we know, has a gamma
distribution with parameters 2 and nλ. 10. Since the lengths of the characters are independent of each other and identically distributed,
for any two intervals 1 and 2 with the same length, the probability that n characters are
emitted during 1 is equal to the probability that n characters are emitted in 2 . Moreover, for
s > 0, the number of characters being emitted during (t, t + s ] is independent of the number of
characters that have been emitted in [0, t ]. Clearly, characters are not emitted simultaneously.
Therefore, N (t) : t ≥ 0 is stationary, possesses independent increments, and is orderly. So
it is a Poisson process. By Exercise 11, Section 7.3, the time until the ﬁrst character is emitted
is exponential with parameter λ = −1000 ln(1 − p). Thus N (t) : t ≥ 0 is a Poisson process
with parameter λ = −1000 ln(1 − p). Knowing this, we have that the time until the message
is emitted, that is, the time until the k th character is emitted is gamma with parameters k and
λ = −1000 ln(1 − p). 7.5 BETA DISTRIBUTIONS 1. Yes, it is a probability density function of a beta random variable with parameters α = 2 and
β = 3. Note that 1
4!
=
= 12. We have
B(2, 3)
1! 2!
E(X) = 2
,
5 VarX = 6
1
=.
6(52 )
25 2. No, it is not because, for α = 3 and β = 5, we have
7!
1
=
= 105 = 120.
B(3, 5)
2 ! 4! 3. Let α = 5 and β = 6. Then f is the probability density function of a beta random variable
with parameters 5 and 6 for
c= 1
10!
=
= 1260.
B(5, 6)
4! 5! 148 Chapter 7 Special Continuous Distributions For this value of c,
E(X) = 5
,
11 VarX = 30
5
=
.
2)
12(11
242 4. The answer is
P (p ≥ 0.60) = 1
0.60 = 1
x 19 (1 − x)12 dx
B(20, 13) 32!
19! 12! 1 x 19 (1 − x)12 dx = 0.538. 0.60 5. Let X be the proportion of resistors the procurement ofﬁce purchases from this vendor. We
know that X is beta. Let α and β be the parameters of the density function of X. Then
⎧
⎪ α =1
⎪
⎪
⎨α + β
3
⎪
⎪
⎪
⎩ 1
αβ
=.
(α + β + 1)(α + β)2
18 Solving this system of 2 equations in 2 unknowns, we obtain α = 1 and β = 2. The desired
probability is
P (X ≥ 7/12) = 1
7/12 1
x 1−1 (1 − x)2−1 dx = 2
B(1, 2) 1
7/12 (1 − x) dx = 50
≈ 0.17.
288 6. Let X be the median of the fractions for the 13 sections of the course; X is a beta random
variable with parameters 7 and 7. Let Y be a binomial random variable with parameters 13
and 0.40. By Theorem 7.2,
P (X ≤ 0.40) = P (Y ≥ 7).
Therefore,
6 13
(0.40)i (0.60)13−i = 0.771156.
i P (X ≥ 0.40) = P (Y ≤ 6) =
i =0 7. Let Y be a binomial random variable with parameters 25 and 0.25; by Theorem 7.2,
P (X ≤ 0.25) = P (Y ≥ 5).
Therefore,
4 P (X ≥ 0.25) = P (Y < 5) =
i =0 25
(0.25)i (0.75)25−i = 0.214.
i Section 7.5 Beta Distributions 149 8. (a) Clearly,
E(Y ) = a + (b − a)E(X) = a + (b − a)
Var(X) = (b − a)2 Var(X) =
(b) α
,
α+β (b − a)2 αβ
.
(α + β + 1)(α + β)2 Note that 0 < X < 1 implies that a < Y < b. Let a < t < b; then
P (Y ≤ t) = P a + (b − a)X ≤ t = P X ≤
(t −a)/(b−a) =
0 t −a
b−a 1
x α−1 (1 − x)β −1 dx.
B(α, β) Let y = (b − a)x + a ; we have
t P (Y ≤ t) =
a
t =
a 1
y−a
B(α, β) b − a α −1 1− 1
1
y−a
·
b − a B(α, β) b − a y−a
b−a β −1 b−y
b−a α −1 · 1
dy
b−a β −1 dy. This shows that the probability density function of Y is
f (y) =
(c) 1
1
y−a
·
b − a B(α, β) b − a α −1 b−y
b−a β −1 , a < y < b. Note that a = 2, b = 6. Hence
3 P (Y < 3) =
2 = 3
64 1 4! y − 2
·
4 1! 2!
4
3 6−y
4 2 dy
67
3 67
·
=
≈ 0.26.
64 12
256 (y − 2)(6 − y)2 dy = 2 9. Suppose that
f (x) = 1
x α−1 (1 − x)β −1 ,
B(α, β) 0 < x < 1, is symmetric about a point a . Then f (a − x) = f (a + x). That is, for 0 < x < min(a, 1 − a),
(a − x)α−1 (1 − a + x)β −1 = (a + x)α−1 (1 − a − x)β −1 . (28) Since α and β are not necessarily integers, for (a − x)α−1 and (1 − a − x)β −1 to be welldeﬁned,
we need to restrict ourselves to the range 0 < x < min(a, 1 − a). Now, if a < 1 − a , then,
by continuity, (28) is valid for x = a . Substituting a for x in (28), we obtain
(2a)α−1 (1 − 2a)β −1 = 0. 150 Chapter 7 Special Continuous Distributions Since a = 0, this implies that a = 1/2. If 1 − a < a , then, by continuity, (28) is valid for
x = 1 − a . Substituting 1 − a for x in (28), we obtain
(2a − 1)α−1 (2 − 2a)β −1 = 0.
Since a = 1, this implies that a = 1/2. Therefore, in either case a = 1/2. In (28), substituting
a = 1/2, and taking x = 1/4, say, we get
(1/4)α−1 (3/4)β −1 = (3/4)α−1 (1/4)β −1 .
This gives 3β −α = 0, which can only hold for α = β. Therefore, only beta density functions
with α = β are symmetric, and they are symmetric about a = 1/2. 10. t = 0 gives x = 0; t = ∞ gives x = 1. Since dx =
∞ B(α, β) = t2
1 + t2 0 α −1 β −1 1
1 + t2 11. We have that 1 B(α, β) = · 2t
dt , we have
(1 + t 2 )2
∞ 2t
dt = 2
(1 + t 2 )2 t 2α−1 (1 + t 2 )−(α+β) dt. 0 x α−1 (1 − x)β −1 dx. 0 Let x = cos θ to obtain
2 π /2 B(α, β) = 2 (cos θ)2α−1 (sin θ)2β −1 dθ. 0 Now ∞ (α) = t α−1 e−t dt. 0 Use the substitution t = y to obtain
2 ∞ (α) = 2 y 2α−1 e−y dy.
2 0 This implies that
∞ (α) (β) = 4
0 ∞ x 2α−1 y 2β −1 e−(x 2 +y 2 ) dxdy. 0 Now we evaluate this double integral by means of a change of variables to polar coordinates:
y = r sin θ , x = r cos θ ; we obtain
∞ (α) (β) = 4
0 π /2
0 r 2(α+β)−1 (cos θ)2α−1 (sin θ)2β −1 e−r dθdr
2 ∞ = 2B(α, β) ∞ r 2(α+β)−1 e−r dr = B(α, β) 0 = B(α, β) (α + β). 2 0 uα+β −1 e−u du (let u = r 2 ) Section 7.5 Thus
B(α, β) = Beta Distributions 151 (α) (β)
.
(α + β) 12. We will show that E(X2 ) = n/(n − 2). Since E(X2 ) < ∞, by Remark 6.6, E(X) < ∞.
Since E(X) exists and xf (x) is an odd function, we have
E(X) = ∞
−∞ xf (x) dx = 0. Consequently,
Var(X) = E(X2 ) − E (X) 2 = Therefore, all we need to ﬁnd is E(X2 ). By Theorem 6.3,
n+1
∞
x2
2
x2 1 +
E(X 2 ) = √
n
n
−∞
nπ
2
√
Substituting x = ( n )t in this integral yields n
.
n−2 −(n+1)/2 dx. n+1
∞
√
2
E(X2 ) = √
(nt 2 )(1 + t 2 )−(n+1)/2 n dt
n
−∞
nπ
2
n+1
∞
= √ 2 n · 2n
t 2 (1 + t 2 )−(n+1)/2 dt.
0
π
2
By the previous two exercises,
n−2
3
∞
3 n−2
2
2
.
=
t 2 (1 + t 2 )−(n+1)/2 dt = B ,
2
n+1
2
2
0
2
Therefore,
3
3
n−2
n−2
n+1
n
2
2
2
2
2
.
=
E(X2 ) = √
√
n ·n·
n
n+1
π·
π
2
2
2
√
By the solution to Exercise 7, Section 7.4, (1/2) = π . Using the identity (r + 1) = r (r),
we have
√
3
11
π
=
=
;
2
22
2
n
n−2
n−2 n−2
=
+1 =
.
2
2
2
2 152 Chapter 7 Special Continuous Distributions Consequently, √
n−2
π
2
2
E(X2 ) =
√
n−2 n−2
π·
2
2
n 7.6 = n
.
n−2 SURVIVAL ANALYSIS AND HAZARD FUNCTIONS 1. Let X be the lifetime of the electrical component, F be its probability distribution function,
and λ(t) be its failure rate. For some constants α and β , we are given that
λ(t) = αt + β.
Since λ(48) = 0.10 and λ(72) = 0.15,
48α + β = 0.10
72α + β = 0.15.
Solving this system of two equations in two unknowns gives α = 1/480 and β = 0. Hence
λ(t) = t/480. By (7.6), for t > 0,
t ¯
P (X > t) = F (t) = exp −
0 u
2
du = e−t /960 .
480 Let f be the probability density function of X. This also gives
f (t) = − d¯
t −t 2 /960
F (t) =
e
.
dt
480 The answer to part (a) is
P (X > 30) = e−900/960 = e−0.9375 = 0.392.
The exact value for part (b) is
P (X < 31  X > 30) =
= P (30 < X < 31)
P (X > 30)
1
0.392 31
30 (t/480)e−t 2 /960 dt = 0.02411
= 0.0615.
0.392 Note that for small t , λ(t) t is approximately the probability that the component fails
within t hours after t , given that it has not yet failed by time t . Letting t = 1, for t = 30,
λ(t) t ≈ 0.0625 which is relatively close to the exact value of 0.0615. This is interesting
because t = 1 is not that small, and one may not expect close approximations anyway. Chapter 7 Review Problems 153 ¯
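The numbers in this solution are easy to reproduce from the closed-form survival function F̄(t) = e^{−t²/960}; a short numerical check:

```python
import math

def survival(t):
    """Survival function for the failure rate lambda(t) = t/480."""
    return math.exp(-t**2 / 960)

p30 = survival(30)                       # P(X > 30) = e^{-0.9375}
assert abs(p30 - 0.392) < 1e-3

# P(X < 31 | X > 30) = [survival(30) - survival(31)] / survival(30)
cond = (survival(30) - survival(31)) / survival(30)
assert abs(cond - 0.0615) < 1e-3

# lambda(30) * dt with dt = 1 approximates the same conditional probability.
assert abs(30 / 480 - 0.0625) < 1e-12
```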
2. Let F be the survival function of a Weibull random variable. We have
∞ ¯
F (t) = αx α−1 e−x dx.
α t Letting u = x α , we have du = αx α−1 dx. Thus
¯
F (t) = ∞
tα e−u du = −e−u Therefore, ∞
tα = e−t .
α αt α−1 e−t
= αt α−1 ·
λ(t) =
e−t α
λ(t) = 1, for α = 1; so the Weibull in this case is exponential with parameter 1. Clearly, for
α < 1, λ (t) < 0; so λ(t) is decreasing. For α > 1, λ (t) > 0; so λ(t) is increasing. Note that
for α = 2, the failure rate is the straight line λ(t) = 2t.
α REVIEW PROBLEMS FOR CHAPTER 7
1. 30 − 25
5
=
.
37 − 25
12 2. Let X be the weight of a randomly selected women from this community. The desired quantity
is
P (X > 170)
=
P (X > 170  X > 140) =
P (X > 140) = 170 − 130
20
140 − 130
P Z>
20
P Z> 1 − (2)
1 − 0.9772
P (Z > 2)
=
=
= 0.074.
P (Z > 0.5)
1 − (0.5)
1 − 0.6915 3. Let X be the number of times the digit 5 is generated; X is binomial with parameters n = 1000
√
√
and p = 1/10. Thus np = 100 and np(1 − p) =
and making correction for continuity,
P (X ≤ 93.5) = P Z ≤ 90 = 9.49. Using normal approximation 93.5 − 100
= P (Z ≤ −0.68) = 1 −
9.49 4. The given relation implies that
1 − e−2λ = 2 (1 − e−3λ ) − (1 − e−2λ ) . (0.68) = 0.248. 154 Chapter 7 Special Continuous Distributions This is equivalent to
3e−2λ − 2e−3λ − 1 = 0,
or, equivalently,
e−λ − 1 2 2e−λ + 1 = 0. The only root of this equation is λ = 0 which is not acceptable. Therefore, it is not possible
that X satisfy the given relation. 5. Let X be the lifetime of a random light bulb. Then
P (X < 1700) = 1 − e−(1/1700)·1700 = 1 − e−1 .
The desired probability is
1 − P (none fails) − P (one fails)
20
20
=1−
(1 − e−1 )0 (e−1 )20 −
0
1 1 − e−1 e−1 19 = 0.999999927. 6. Note that limx →0 x ln x = 0; so
1 E(− ln X) = 1 (− ln x) dx = x − x ln x 0 0 = 1. 7. Let X be the diameter of the randomly chosen disk in inches. We are given that X ∼ N(4, 1).
We want to ﬁnd the distribution function of 2.5X ; we have
1
P (2.5X ≤ x) = P (X ≤ x/2.5) = √
2π x /2.5 e−(t −4) 2 /2 dt. −∞ 8. If α < 0, then α + β < β ; therefore,
P (α ≤ X ≤ α + β) = P (0 ≤ X ≤ α + β) ≤ P (0 ≤ X ≤ β).
If α > 0, then e−λα < 1. Thus
P (α ≤ X ≤ α + β) = 1 − e−λ(α+β) − 1 − e−λα
= e−λα 1 − e−λβ < 1 − e−λβ = P (0 ≤ X ≤ β). 9. We are given that 1/λ = 1.25; so λ = 0.8. Let X be the time it takes for a random student to
complete the test. Since P (X > 1) = e−(0.8)1 = e−0.8 , the desired probability is
1 − e−0.8 10 = 1 − e−8 = 0.99966. Chapter 7 10. Note that f (x) = ke−[x −(3/2)] 2 +17/4 Review Problems 155 = ke17/4 · e−[x −(3/2)] .
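A one-line check of the complement computation above (at least one of the 10 students finishes within the hour):

```python
import math

# P(a student needs more than 1 hour) = e^{-0.8}; with 10 independent students,
# P(at least one finishes within the hour) = 1 - (e^{-0.8})^10 = 1 - e^{-8}.
p = 1 - math.exp(-0.8) ** 10
assert abs(p - (1 - math.exp(-8))) < 1e-12
assert abs(p - 0.99966) < 1e-5
```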
2 Comparing this with the probability density function of a normal random variable with mean
√
3/2, we see that σ 2 = 1/2 and ke17/4 = 1/(σ 2π ). Therefore,
k= 1
1
√ e−17/4 = e−17/4 .
π
σ 2π 11. Let X be the grade of a randomly selected student.
P (X ≥ 90) = P Z ≥ 90 − 72
=1−
7 (2.57) = 0.0051. Similarly,
P (80 ≤ X < 90) = P (1.14 ≤ Z < 2.57) = 0.122,
P (70 ≤ X < 80) = P (−0.29 ≤ Z < 1.14) = 0.487,
P (60 ≤ X < 70) = P (−1.71 ≤ Z < −0.29) = 0.3423,
P (X < 60) = P (Z < −1.71) = 0.0436.
Therefore, approximately 0.51% will get A, 12.2% will get B, 48.7% will get C, 34.23% D,
and 4.36% F. 12. Since E(X) = 1/λ,
P X > E(X) = e−λ(1/λ) = e−1 = 0.36788. 13. Round off error to the nearest integer is uniform over (−0.5, 0.5); round off error to the nearest 1st decimal place is uniform over (−0.05, 0.05); round off error to the nearest 2nd decimal
place is uniform over (−0.005, 0.005), and so on. In general, round off error to the nearest k
decimal places is uniform over (−5/10k+1 , 5/10k+1 ). 14. We want to ﬁnd the smallest a for which P (X ≤ a) ≥ 0.90. This implies
P Z≤ a − 175
≥ 0.90.
22 Using Table 1 of the appendix, we see that (a − 175)/22 = 1.29 or a = 203.38. 15. Let X be the breaking strength of the yarn under consideration. Clearly,
P (X ≥ 100) = P Z ≥ 100 − 95
=1−
11 (0.45) = 0.33. So the desired probability is
1− 10
10
(0.33)1 (0.67)9 = 0.89.
(0.33)0 (0.67)10 −
1
0 156 Chapter 7 Special Continuous Distributions 16. Let X be the time until the 91st call is received. X is a gamma random variable with parameters
r = 91 and λ = 23. The desired probability is
∞ P (X ≥ 4) =
4 23e−23x (23x)91−1
dx
(91)
4 =1−
0 =1− 23e−23x (23x)91−1
dx
90! 2391
90! 4 x 90 e−23x dx = 1 − 0.55542 = 0.44458. 0 17. Clearly,
E(X) = (1 − θ) + (1 + θ)
= 1,
2 Var(X) = θ2
(1 + θ − 1 + θ)2
=.
12
3 Now
E X2 − E (X) 2 = θ2
3 implies that
θ2
+ 1,
3
which yeilds 3E(X2 ) − 1 = θ 2 , or, equivalently, E(3X2 − 1) = θ 2 . Therefore, one choice for
g(X) is g(X) = 3X2 − 1.
E X2 = 18. Let α and β be the parameters of the density function of X/ . Solving the following two
equations in two unknowns,
E(X/ ) =
Var(X/ ) = α
3
=,
α+β
7
αβ
3
=,
2
(α + β + 1)(α + β)
98 we obtain α = 3 and β = 4. Therefore, X/ is beta with parameters 3 and 4. The desired
probability is
P ( /7 < X < /3) = P (1/7 < X/ < 1/3) = 1/3
1/7 = 60 1/3
1/7 1
x 2 (1 − x)3 dx
B(3, 4) x 2 (1 − x)3 dx = 0.278. Chapter 8 Bivariate Distributions
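The final beta integral of Problem 18 can be reproduced by straightforward quadrature; a sketch using Simpson's rule (the step count is an arbitrary choice):

```python
def simpson(f, a, b, n=10_000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    total += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return total * h / 3

# 1/B(3, 4) = 6!/(2! 3!) = 60, so the desired probability is
# 60 times the integral of x^2 (1-x)^3 over (1/7, 1/3).
prob = 60 * simpson(lambda x: x**2 * (1 - x) ** 3, 1 / 7, 1 / 3)
assert abs(prob - 0.278) < 1e-3
```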
8.1 JOINT DISTRIBUTIONS OF TWO RANDOM VARIABLES 1. (a) 2
x =1 2
y =1 k(x/y) = 1 implies that k = 2/9. (b) pX (x) = 2
y =1 (2x)/(9y) = x/3, pY (y) = 2
x =1 (2x)/(9y) = 2/(3y), (c) P (X > 1  Y = 1) =
2 2 (d) E(X) = x·
y =1 x =1 2. (a) 3
x =1 2
y =1 x = 1, 2 .
y = 1, 2. p(2, 1)
2
4/9
=.
=
pY (1)
2/3
3 2x
5
=;
9y
3 2 2 y· E(Y ) =
y =1 x =1 2x
4
=.
9y
3 c(x + y) = 1 implies that c = 1/21. (b) pX (x) = 2
y =1 (1/21)(x + y) = (2x + 3)/21. x = 1, 2, 3. pY (y) = 3
x =1 (1/21)(x + y) = (6 + 3y)/21. y = 1, 2. (c) P (X ≥ 2  Y = 1) =
3 2 (d) E(X) =
x =1 y =1 p(2, 1) + p(3, 1)
7/21
7
=
=.
pY (1)
9/21
9 46
1
x(x + y) = ;
21
21 3 2 E(Y ) =
x =1 y =1 11
1
y(x + y) = .
21
7 3. (a) k(1 + 1 + 1 + 9 + 4 + 9) = 1 implies that k = 1/25.
(b) pX (1) = p(1, 1) + p(1, 3) = 12/25, pX (2) = p(2, 3) = 13/25; pY (1) = p(1, 1) = 2/25, pY (3) = p(1, 3) + p(2, 3) = 23/25. 158 Chapter 8 Bivariate Distributions
Therefore,
pX (x) = (c) E(X) = 1 · ⎧
⎨12/25 if x = 1 ⎩13/25 if x = 2, 13
38
12
+2·
=;
25
25
25 pY (y) = E(Y ) = 1 · ⎧
⎨2/25 if y = 1 ⎩23/25 if y = 3. 2
23
71
+3·
=.
25
25
25 4. P (X > Y ) = p(1, 0) + p(2, 0) + p(2, 1) = 2/5, P (X + Y ≤ 2) = p(1, 0) + p(1, 1) + p(2, 0) = 7/25,
P (X + Y = 2) = p(1, 1) + p(2, 0) = 6/25. 5. Let X be the number of sheep stolen; let Y be the number of goats stolen. Let p(x, y) be the
joint probability mass function of X and Y . Then, for 0 ≤ x ≤ 4, 0 ≤ y ≤ 4, 0 ≤ x + y ≤ 4, p(x, y) = 7
x 8
y 5
4−x−y
;
20
4 p(x, y) = 0, for other values of x and y . 6. The following table gives p(x, y), the joint probability mass function of X and Y ; pX (x), the
marginal probability mass function of X ; and pY (y), the marginal probability mass function
of Y .
y
x
2
3
4
5
6
7
8
9
10
11
12
pY (y) 0
1/36
0
1/36
0
1/36
0
1/36
0
1/36
0
1/36
6/36 1
0
2/36
0
2/36
0
2/36
0
2/36
0
2/36
0
10/36 2
0
0
2/36
0
2/36
0
2/36
0
2/36
0
0
8/36 3
0
0
0
2/36
0
2/36
0
2/36
0
0
0
6/36 4
0
0
0
0
2/36
0
2/36
0
0
0
0
4/36 7. p(1, 1) = 0, p(1, 0) = 0.30, p(0, 1) = 0.50, p(0, 0) = 0.20. 5
0
0
0
0
0
2/36
0
0
0
0
0
2/36 pX (x)
1/36
2/36
3/36
4/36
5/36
6/36
5/36
4/36
3/36
2/36
1/36 Section 8.1 Joint Distributions of Two Random Variables 159 8. (a) For 0 ≤ x ≤ 7, 0 ≤ y ≤ 7, 0 ≤ x + y ≤ 7,
13
x p(x, y) = 13
y 26
7−x−y
.
52
7 For all other values of x and y , p(x, y) = 0.
3
y =0 (b) P (X ≥ Y ) = 7−y
x =y p(x, y) = 0.61107.
1 x 9. (a) fX (x) = 2 dy = 2x, 0 ≤ x ≤ 1; fY (y) = 0
1 (b) E(X) = 0 ≤ y ≤ 1. x(2x) dx = 2/3; 0
1 E(Y ) = 1 xfX (x) dx = 0 1 yfY (y) dy = 0 2y(1 − y) dy = 1/3. 0 1
=
2 (c) P X < 2 dx = 2(1 − y), y 1/2 1/2 fX (x) dx = 0 0
1 P (X < 2Y ) =
0 x 2 dy dx =
x/2 1
2x dx = ,
4 1
,
2 P (X = Y ) = 0.
x 10. (a) fX (x) = 8xy dy = 4x 3 , 0 ≤ x ≤ 1, 0
1 fY (y) = 8xy dx = 4y(1 − y 2 ), 0 ≤ y ≤ 1. y
1 (b) E(X) =
0 2
0 1 yfY (y) dy = 0 11. fX (x) = x · 4x 3 dx = 4/5; 0
1 E(Y ) = 1 xfX (x) dx = 1 −x
ye dy = e−x ,
2 y · 4y(1 − y 2 ) dy = 8/15. 0 x > 0; ∞ fY (y) =
0 1
1 −x
ye dx = y,
2
2 0 < y < 2. 12. Let R = (x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 . Since area(R) = 1, P (X + Y ≤ 1/2) is the area of the region (x, y) ∈ R : x + y ≤ 1/2 which is 1/8. Similarly, P (X − Y ≤ 1/2) is the 160 Chapter 8 Bivariate Distributions area of the region (x, y) ∈ R : x − y ≤ 1/2 which is 7/8. P (X 2 + Y 2 ≤ 1) is the area of
the region (x, y) ∈ R : x 2 + y 2 ≤ 1 which is π/4. P (XY ≤ 1/4) is the sum of the area
of the region (x, y) : 0 ≤ x ≤ 1/4, 0 ≤ y ≤ 1 which is 1/4 and the area of the region under
the curve y = 1/(4x) from 1/4 to 1. (Draw a ﬁgure.) Therefore, P (XY ≤ 1/4) = 1 13. (a) The area of R is
0 x √ √
y y y
1 xfX (x) dx = 0 0 < x < 1; √
6 dx = 6( y − y), 0 < y < 1. 6x 2 (1 − x) dx = 1/2; 0
1 0 if (x, y) ∈ R
elsewhere. y
1 E(Y ) = 6
0 6 dy = 6x(1 − x), x2 f (x, y) dx = (c) E(X) = 1/4 1
dx ≈ 0.597.
4x x f (x, y) dy = x2 fY (y) = 1 1
(x − x 2 ) dx = ; so
6 f (x, y) = (b) fX (x) = 1
+
4 1 yfY (y) dy = √
6y( y − y) dy = 2/5. 0 14. Let X and Y be the minutes past 11:30 that the man and his ﬁancée arrive at the
lobby, respectively. We have that X and Y are uniformly distributed over (0, 30). Let
S = (x, y) : 0 ≤ x ≤ 30, 0 ≤ y ≤ 30 , and R = (x, y) ∈ S : y ≤ x − 12 or y ≥ x + 12 .
The desired probability is the area of R divided by the area of S : 324/900 = 0.36. (Draw a
ﬁgure.)
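The answer 324/900 = 0.36 is the exact area computation above; it can also be confirmed by simulation (the seed and sample size are arbitrary):

```python
import random

random.seed(0)
n = 200_000
# X, Y: arrival minutes past 11:30 A.M., uniform on (0, 30); the two people
# fail to meet when their arrival times differ by 12 minutes or more.
miss = sum(abs(random.uniform(0, 30) - random.uniform(0, 30)) >= 12
           for _ in range(n))
estimate = miss / n
assert abs(estimate - 324 / 900) < 0.01   # exact answer: 324/900 = 0.36
```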
A.M. 15. Let X and Y be two randomly selected points from the interval (0, ). We are interested in
E X − Y  . Since the joint probability density function of X and Y is f (x, y) = ⎧
⎪1
⎨ 0<x< , 0<y< ⎪
⎩0 elsewhere, 2 Section 8.1 x − y  E X − Y  =
0 0 2 = 6 2 + (y − x) dx d y +
0 1 (x − y) dx d y 2 0 6 161 dx dy y 1 = 1 Joint Distributions of Two Random Variables 0 y =.
3 16. The problem is equivalent to the following: Two random numbers X and Y are selected at
random and independently from (0, ). What is the probability that X − Y  < X? Let
S = (x, y) : 0 < x < , 0 < y < and
R = (x, y) ∈ S : x − y  < x = (x, y) ∈ S : y < 2x .
The desired probability is the area of R which is 3 2 /4 divided by
(Draw a ﬁgure.) 2 . So the answer is 3/4. 17. Let S = (x, y) : 0 < x < 1, 0 < y < 1 and R = (x, y) ∈ S : y ≤ x and x 2 + y 2 ≤ 1 .
The desired probability is the area of R which is π/8 divided by the area of S which is 1. So
the answer is π/8. 18. We prove this for the case in which X and Y are continuous random variables with joint
probability density function f . For discrete random variables the proof is similar. The relation
P (X ≤ Y ) = 1, implies that f (x, y) = 0 if x > y . Hence by Theorem 8.2,
E(X) =
=
≤
= ∞ ∞ −∞
∞ −∞
y −∞
∞ −∞
y −∞
∞ −∞
∞ −∞ −∞ xf (x, y) dx dy
xf (x, y) dx dy
yf (x, y) dx dy
yf (x, y) dx dy = E(Y ). 19. Let H be the distribution function of a random variable with probability density function h.
x That is, let H (x) = h(y) dy. Then
−∞ P (X ≥ Y ) =
= ∞
−∞
∞
−∞ x
−∞ h(x)h(y) dy dx = h(x)H (x) dx = 1
H (x)
2 ∞ x h(x) h(y) dy d x −∞
2 ∞
−∞ −∞ = 12
1
(1 − 02 ) = .
2
2 20. Since 0 ≤ 2G(x) − 1 ≤ 1, 0 ≤ 2H (y) − 1 ≤ 1, and −1 ≤ α ≤ 1, we have that
−1 ≤ α 2G(x) − 1 2H (y) − 1 ≤ 1. 162 Chapter 8 Bivariate Distributions So
0 ≤ 1 + α 2G(x) − 1 2H (y) − 1 ≤ 2.
This and g(x) ≥ 0, h(y) ≥ 0 imply that f (x, y) ≥ 0. To prove that f is a joint probability
∞ ∞ −∞ −∞ ∞ ∞ −∞ −∞
∞ density function, it remains to show that
∞ ∞ −∞ −∞ f (x, y) dx dy = 1. f (x, y) dx dy
= ∞ ∞ −∞ −∞ =1+α g(x)h(y) dx dy + α
∞ −∞ h(y) 2H (y) − 1 d y −∞ 1
2∞ 1
2H (y) − 1
2G(x) − 1
−∞ 4
4
Now we calculate the marginals.
=1+α fX (x) =
= ∞
−∞
∞
−∞ g(x)h(y) 2G(x) − 1 2H (y) − 1 d x dy
g(x) 2G(x) − 1 d x
2∞
−∞ = 1 + α · 0 · 0 = 1. g(x)h(y) 1 + α 2G(x) − 1 2H (y) − 1
g(x)h(y) dy + α = g(x) ∞
−∞ ∞
−∞ dy g(x)h(y) 2G(x) − 1 2H (y) − 1 d y h(y) dy + αg(x) 2G(x) − 1 ∞
−∞
2∞ h(y) 2H (y) − 1 d y 1
2H (y) − 1
−∞
4
= g(x) + αg(x) 2G(x) − 1 · 0 = g(x) + 0 = g(x).
= g(x) + αg(x) 2G(x) − 1 Similarly, fY (y) = h(y). 21. Orient the circle counterclockwise and let X be the length of the arc N M and Y be length of
the arc NL. Let R be the radius of the circle; clearly, 0 ≤ X ≤ 2π R and 0 ≤ Y ≤ 2π R .
The angle MN L is acute if and only if Y − X  < π R. Therefore, the sample space of this
experiment is
S = (x, y) : 0 ≤ x ≤ 2π R, 0 ≤ y ≤ 2π R
and the desired event is
E = (x, y) ∈ S : y − x  < πR .
The probability that M N L is acute is the area of E which is 3π 2 R 2 divided by the area of S
which is 4π 2 R 2 ; that is, 3/4. 22. Let
S = (x, y) ∈ R2 : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 , A = (x, y) ∈ S : 0 < x + y < 0.5 , B = (x, y) ∈ S : 0.5 < x + y < 1.5 , C = (x, y) ∈ S : x + y > 1.5 . Section 8.1 Joint Distributions of Two Random Variables The probability that the integer nearest to x + y is 0 is
integer nearest to x + y is 1 is
x + y is 2 is 163 1
area(A)
= , The probability that the
area (S)
8 area(B)
3
= , and the probability that the nearest integer to
area(S)
4 area (C)
1
=.
area(S)
8 4
=4
3
ways we can select three of X , a − X , Y , and b − Y . If X, a − X, and Y are selected, a
triangular pen is possible to make if and only if X < (a − X) + Y , a − X < X + Y , and
Y < X + (a − X). The probability of this event is the area of 23. Let X be a random number from (0, a) and Y be a random number from (0, b). In (x, y) ∈ R2 : 0 < x < a, 0 < y < b, 2x − y < a, 2x + y > a, y < a
which is a 2 /2 divided by the area of
S = (x, y) ∈ R2 : 0 < x < a, 0 < y < b
which is ab: (a 2 /2)/ab = a/(2b). Similarly, for each of the other three 3combinations of
X , a − x , Y , and b − Y also the probability that the three segments can be used to form a
triangular pen is a/(2b). Thus the desired probability is
1a
1a
1a
a
1a
·
+·
+·
+·
=
.
4 2b 4 2b 4 2b 4 2b
2b 24. Let X and Y be the two points that are placed on the segment. Let E be the event that the length
of none of the three parts exceeds the given value α . Clearly, P (E  X < Y ) = P (E  Y < X)
and P (X < Y ) = P (Y < X) = 1/2. Therefore,
P (E) = P (E  X < Y )P (X < Y ) + P (E  Y < X)P (Y < X)
1
1
= P (E  X < Y ) + P (E  X < Y ) = P (E  X < Y ).
2
2
This shows that for calculation of P (E), we may reduce the sample space to the case where
X < Y . The reduced sample space is
S = (x, y) : x < y, 0 < x < , 0 < y < . The desired probability is the area of
R = (x, y) ∈ S : x < α, y − x < α, y > − α
divided by area(S) = 2 /2. But ⎧
⎪ (3α − )2
⎪
⎪
⎨
2
area(R) =
⎪2 32
α
⎪−
⎪
1−
⎩
2
2 if 3 2 if 2 ≤α≤ 2 ≤α≤ . 164 Chapter 8 Bivariate Distributions Hence the desired probability is P (E) = ⎧ 3α
⎪
−1
⎪
⎨ 2 ⎪
⎪1 − 3 1 − α
25. R is the square bounded by the lines x + y = 1, −x + y = 1, −x − y = 1, and x − y = 1; its area is 2. To find the probability density function of X, the x-coordinate of the point selected at random from R, first we calculate P(X ≤ t) for all t. For −1 ≤ t < 0, P(X ≤ t) is the area of the triangle bounded by the lines −x + y = 1, −x − y = 1, and x = t, which is (1 + t)², divided by area(R) = 2. (Draw a figure.) For 0 ≤ t < 1, P(X ≤ t) is the area inside R to the left of the line x = t, which is 2 − (1 − t)², divided by area(R) = 2. Therefore,

    P(X ≤ t) = 0                  if t < −1
    P(X ≤ t) = (1 + t)²/2         if −1 ≤ t < 0
    P(X ≤ t) = [2 − (1 − t)²]/2   if 0 ≤ t < 1
    P(X ≤ t) = 1                  if t ≥ 1,

and hence

    (d/dt)P(X ≤ t) = 1 + t   if −1 ≤ t < 0
    (d/dt)P(X ≤ t) = 1 − t   if 0 ≤ t < 1
    (d/dt)P(X ≤ t) = 0       otherwise.

This shows that fX(t), the probability density function of X, is given by fX(t) = 1 − |t|,
−1 ≤ t ≤ 1; 0, elsewhere. 26. Clearly,
    P(Z ≤ z) = ∫∫_{(x,y) : y/x ≤ z} f(x, y) dx dy.

Now for x > 0, y/x ≤ z if and only if y ≤ xz; for x < 0, y/x ≤ z if and only if y ≥ xz. Therefore, the integration region is

    {(x, y) : x < 0, y ≥ xz} ∪ {(x, y) : x > 0, y ≤ xz}.

Thus

    P(Z ≤ z) = ∫_{−∞}^{0} ∫_{xz}^{∞} f(x, y) dy dx + ∫_{0}^{∞} ∫_{−∞}^{xz} f(x, y) dy dx.

Using the substitution y = tx, we get

    P(Z ≤ z) = ∫_{−∞}^{0} ∫_{z}^{−∞} x f(x, tx) dt dx + ∫_{0}^{∞} ∫_{−∞}^{z} x f(x, tx) dt dx
             = ∫_{−∞}^{0} ∫_{−∞}^{z} (−x) f(x, tx) dt dx + ∫_{0}^{∞} ∫_{−∞}^{z} x f(x, tx) dt dx
             = ∫_{−∞}^{∞} ∫_{−∞}^{z} |x| f(x, tx) dt dx = ∫_{−∞}^{z} ∫_{−∞}^{∞} |x| f(x, tx) dx dt.

Differentiating with respect to z, the Fundamental Theorem of Calculus implies that

    fZ(z) = (d/dz)P(Z ≤ z) = ∫_{−∞}^{∞} |x| f(x, xz) dx.
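As a sanity check of this formula (our own example, not in the manual): taking f to be the standard bivariate normal density, fZ should be the Cauchy density 1/[π(1 + z²)]; a simple trapezoidal quadrature confirms this:

```python
import math

def f(x, y):
    # standard bivariate normal density, used here only as a test case
    return math.exp(-(x * x + y * y) / 2) / (2 * math.pi)

def f_Z(z, lo=-10.0, hi=10.0, steps=20_000):
    # trapezoidal approximation of the integral of |x| f(x, xz) over x
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * abs(x) * f(x, x * z)
    return total * h

for z in (0.0, 0.5, 2.0):
    print(f_Z(z), 1 / (math.pi * (1 + z * z)))  # the two columns agree
```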
27. Note that there are exactly n such closed semicircular disks because the probability that the diameter through Pi contains any other point Pj is 0. (Draw a figure.) Let E be the event that all the points are contained in a closed semicircular disk. Let Ei be the event that the points are all in Di. Clearly, E = ∪_{i=1}^{n} Ei. Since there is at most one Di, 1 ≤ i ≤ n, that contains all the Pi's, the events E1, E2, …, En are mutually exclusive. Hence

    P(E) = P(∪_{i=1}^{n} Ei) = Σ_{i=1}^{n} P(Ei) = Σ_{i=1}^{n} (1/2)^{n−1} = n(1/2)^{n−1},

where the next-to-last equality follows because P(Ei) is the probability that P1, P2, …, P_{i−1}, P_{i+1}, …, Pn fall inside Di. The probability that any of these falls inside Di is (area of Di)/(area of the disk) = 1/2, independently of the others. Hence the probability that all of them fall inside Di is (1/2)^{n−1}.
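Since only the angular positions of the points matter for this event, the answer n(1/2)^{n−1} is easy to simulate (our own sketch; with n = 4 the prediction is 4/8 = 0.5): the points lie in a common closed half-disk exactly when the largest circular gap between consecutive angles is at least π.

```python
import math
import random

random.seed(5)

def fits_in_semicircle(angles):
    a = sorted(angles)
    n = len(a)
    gaps = [a[i + 1] - a[i] for i in range(n - 1)] + [a[0] + 2 * math.pi - a[-1]]
    return max(gaps) >= math.pi   # some semicircle contains every point

n_pts, trials = 4, 100_000
hits = sum(fits_in_semicircle([random.uniform(0, 2 * math.pi) for _ in range(n_pts)])
           for _ in range(trials))
print(hits / trials)  # about n * (1/2) ** (n - 1) = 0.5
```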
28. We have that

    fX(x) = [Γ(α + β + γ)/(Γ(α)Γ(β)Γ(γ))] x^{α−1} ∫_0^{1−x} y^{β−1}(1 − x − y)^{γ−1} dy
          = [1/(B(α, β + γ)B(β, γ))] x^{α−1} ∫_0^{1−x} y^{β−1}(1 − x − y)^{γ−1} dy.

Let z = y/(1 − x); then dy = (1 − x) dz, and

    ∫_0^{1−x} y^{β−1}(1 − x − y)^{γ−1} dy = (1 − x)^{β+γ−1} ∫_0^1 z^{β−1}(1 − z)^{γ−1} dz = (1 − x)^{β+γ−1} B(β, γ).

So

    fX(x) = [1/(B(α, β + γ)B(β, γ))] x^{α−1}(1 − x)^{β+γ−1} B(β, γ) = [1/B(α, β + γ)] x^{α−1}(1 − x)^{β+γ−1}.

This shows that X is beta with parameters (α, β + γ). A similar argument shows that Y is beta with parameters (β, γ + α).
29. It is straightforward to check that f(x, y) ≥ 0, f is continuous, and

    ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1.

Therefore, f is a continuous probability density function. We will show that ∂F/∂x does not exist at (0, 0). Similarly, one can show that ∂F/∂x does not exist at any point on the y-axis. Note that for small Δx > 0,

    F(Δx, 0) − F(0, 0) = P(X ≤ Δx, Y ≤ 0) − P(X ≤ 0, Y ≤ 0) = P(0 ≤ X ≤ Δx, Y ≤ 0)
                       = ∫_0^{Δx} ∫_{−∞}^{0} f(x, y) dy dx.

Now, from the definition of f(x, y), we must have x < (1/2)e^y or, equivalently, y > ln(2x). Thus, for small Δx > 0,

    F(Δx, 0) − F(0, 0) = ∫_0^{Δx} ∫_{ln(2x)}^{0} (1 − 2xe^{−y}) dy dx = (Δx)² − (Δx) ln(2Δx) + Δx/2.

This implies that

    lim_{Δx→0⁺} [F(Δx, 0) − F(0, 0)]/Δx = lim_{Δx→0⁺} [Δx − ln(2Δx) + 1/2] = ∞,

showing that ∂F/∂x does not exist at (0, 0).
pY (0) = 5/25, and p(1, 0) = 1/25. Since p(1, 0) = pX (1)pY (0), X and Y are dependent. 2. Note that
1
p(1, 1) = ,
7
12
3
+=,
77
7
6
15
pY (1) = p(1, 1) + p(2, 1) = + = .
77
7 pX (1) = p(1, 1) + p(1, 2) = Since p(1, 1) = pX (1)pY (1), X and Y are dependent. Section 8.2 Independent Random Variables 167 3. By the independence of X and Y ,
P (X = 1, Y = 3) = P (X = 1)P (Y = 3) = 12 12
·
23 23 3 = 4
.
81 P (X + Y = 3) = P (X = 1, Y = 2) + P (X = 2, Y = 1)
= 12 12
·
23 23 2 + 12
23 2 · 4
12
=.
23
27 4. No, they are not independent because, for example, P (X = 0  Y = 8) = 1 but P (X = 0) = 39
8
52
8 = 0.08175 = 1, showing that P (X = 0  Y = 8) = P (X = 0). 5. The answer is 7
2 1
2 2 1
2 5 · 8
2 1
2 2 1
2 6 = 0.0179. 6. We have that
P max(X, Y ) ≤ t = P (X ≤ t, Y ≤ t) = P (X ≤ t)P (Y ≤ t) = F (t)G(t).
P min(X, Y ) ≤ t = 1 − P min(X, Y ) > t
= 1 − P (X > t, Y > t) = 1 − P (X > t)P (Y > t)
= 1 − 1 − F (t) 1 − G(t) = F (t) + G(t) − F (t)G(t). 7. Let X and Y be the number of heads obtained by Adam and Andrew, respectively. The desired
probability is
n n P (X = i, Y = i) =
i =0 P (X = i)P (Y = i)
i =0
n =
i =0 = 1
2 1
2 n
i
2n n
i =0 where the last equality follows by Example 2.28. i 1
2 n
i n−i 2 = ·
1
2 n
i 1
2 2n 2n
,
n i 1
2 n−i 168 Chapter 8 Bivariate Distributions An Intuitive Solution: Let Z be the number of tails obtained by Andrew. The desired probability is
n n n P (X = i, Y = i) =
i =0 P (X = i, Z = i) =
i =0 P (X = i, Y = n − i)
i =0 = P (Adam and Andrew get a total of n heads)
1 2n 2 n
= P ( n heads in 2n ﬂips of a fair coin) =
.
2
n 8. For i, j ∈ 0, 1, 2, 3 , the sum of the numbers in the i th row is pX (i) and the sum of the
numbers in the j th row is pY (j ). We have that
pX (0) = 0.41, pX (1) = 0.44, pX (2) = 0.14, pX (3) = 0.01; pY (0) = 0.41, pY (1) = 0.44, pY (2) = 0.14, pY (3) = 0.01. Since for all x, y ∈ 0, 1, 2, 3 , p(x, y) = pX (x)pY (y), X and Y are independent. 9. They are not independent because
x fX (x) = 2 dy = 2x, 0 ≤ x ≤ 1; 0
1 fY (y) = 2 dx = 2(1 − y), 0 ≤ y ≤ 1; y and so f (x, y) = fX (x)fY (y). 10. Let X and Y be the amount of cholesterol in the ﬁrst and in the second sandwiches, respectively.
Since X and Y are continuous random variables, P (X = Y ) = 0 regardless of what the
probability density functions of X and Y are. 11. We have that
∞ fX (x) =
0 fY (y) =
0 ∞ x 2 e−x(y +1) dy = xex ,
x 2 e−x(y +1) dx = x ≥ 0; 2
,
(y + 1)3 y ≥ 0, where the second integral is calculated by applying integration by parts twice. Now since
f (x, y) = fX (x)fY (y), X and Y are not independent. Section 8.2 Independent Random Variables 169 12. Clearly,
1 E(XY ) =
0 0
1 E(Y ) = 1 (xy)(8xy) dy dx =
0 x
1 E(X) = 1 0 1
x 4
8y 2 dy x 2 dx = ,
9 1 8
x(8xy) dy dx = ,
15
x
1
4
y(8xy) dy dx = .
5
x So E(XY ) = E(X)E(Y ). 13. Since
f (x, y) = e−x · 2e−2y = fX (x)fY (y),
X and Y are independent exponential random variables with parameters 1 and 2, respectively.
Therefore,
1
E(X 2 Y ) = E(X2 )E(Y ) = 2 · = 1.
2 14. The joint probability density function of X and Y is given by
f (x, y) = e−(x +y)
0 x > 0, y > 0
elsewhere. Let G be the probability distribution function, and g be the probability density function of
X/Y . For t > 0,
G(t) = P X
≤ t = P (X ≤ tY )
Y
∞ =
0 ty e−(x +y) dx d y = 0 t
.
1+t Therefore, for t > 0,
g(t) = G (t) = 1
.
(1 + t)2 Note that G (t) = 0 for t < 0; G (0) does not exist. 15. Let F and f be the probability distribution and probability density functions of max(X, Y ),
respectively. Clearly,
F (t) = P max(X, Y ) ≤ t = P (X ≤ t, Y ≤ t) = (1 − e−t )2 ,
Thus
f (t) = F (t) = 2e−t (1 − e−t ) = 2e−t − 2e−2t . t ≥ 0. 170 Chapter 8 Bivariate Distributions Hence ∞ E max(X, Y ) = 2
0
∞ Note that ∞ te−t dt − 2te−2t dt = 2 − 0 1
3
=.
2
2 te−t dt is the expected value of an exponential random variable with parameter 0 ∞ 1, thus it is 1. Also, 2te−2t dt is the expected value of an exponential random variable 0 with parameter 2, thus it is 1/2. 16. Let F and f be the probability distribution and probability density functions of max(X, Y ).
For −1 < t < 1,
F (t) = P max(X, Y ) ≤ t = P (X ≤ t, Y ≤ t) = P (X ≤ t)P (Y ≤ t) =
Thus
f (t) = F (t) =
Therefore, t +1
,
2 1 E(X) = t
−1 t +1
2 2 . −1 < t < 1. 1
t +1
dt = .
2
3 17. Let F and f be the probability distribution and probability density functions of XY , respectively. Clearly, for t ≤ 0, F (t) = 0 and for t ≥ 1, F (t) = 1. For 0 < t < 1,
1 F (t) = P (XY ≤ t) = 1 − P (XY > t) = 1 −
t Hence
f (t) = F (t) = − ln t
0 1 dy dx = t − t ln t. t/x 0<t <1
elsewhere. 18. The joint probability density function of X and Y is given by
f (x, y) =
Now ⎧
⎪
⎨ 1
1
=
area (R)
π ⎪
⎩0 √ fX (x) = if (x, y) ∈ R
otherwise. 1−x 2 √
− 1−x 2 1
2
dy =
π
π 1 − x2, 1
2
dx =
π
π 1 − y2. √
fY (y) = 1−y 2 √ − 1−y 2 Since f (x, y) = fX (x)fY (y), the random variables X and Y are not independent. Section 8.2 Independent Random Variables 171 19. Let X be the number of adults and Y be the number of children who get sick. The desired
probability is

    Σ_{i=0}^{5} Σ_{j=i+1}^{6} P(Y = i, X = j) = Σ_{i=0}^{5} Σ_{j=i+1}^{6} P(Y = i)P(X = j)
    = Σ_{i=0}^{5} Σ_{j=i+1}^{6} C(6, i)(0.30)^i(0.70)^{6−i} · C(6, j)(0.2)^j(0.8)^{6−j} = 0.22638565.

20. Let X be the lifetime of the muffler Elizabeth buys from company A and Y be the lifetime of
the muffler she buys from company B. The joint probability density function of X and Y is h(x, y) = f(x)g(y), x > 0, y > 0. So the desired probability is

    P(Y > X) = ∫_0^∞ [ ∫_x^∞ (2/11)e^{−2y/11} dy ] (1/6)e^{−x/6} dx = 11/23.

21. If IA and IB are independent, then
P (IA = 1, IB = 1) = P (IA = 1)P (IB = 1).
This is equivalent to P (AB) = P (A)P (B) which shows that A and B are independent. On
the other hand, if {A, B } is an independent set, so are the following: A, B c , Ac , B , and
Ac , B c . Therefore,
P (AB) = P (A)P (B), P (AB c ) = P (A)P (B c ), P (Ac B) = P (Ac )P (B), P (Ac B c ) = P (Ac )P (B c ). These relations, respectively, imply that
P (IA = 1, IB = 1) = P (IA = 1)P (IB = 1),
P (IA = 1, IB = 0) = P (IA = 1)P (IB = 0),
P (IA = 0, IB = 1) = P (IA = 0)P (IB = 1),
P (IA = 0, IB = 0) = P (IA = 0)P (IB = 0).
These four relations show that IA and IB are independent random variables. 22. The joint probability density function of B and C is
    f(b, c) = 9b²c²/676 if 1 < b < 3, 1 < c < 3, and 0 otherwise.

For X² + BX + C to have two real roots we must have B² − 4C > 0 or, equivalently, B² > 4C. Let

    E = {(b, c) : 1 < b < 3, 1 < c < 3, b² > 4c};

the desired probability is

    ∫∫_E (9b²c²/676) db dc = ∫_2^3 ∫_1^{b²/4} (9b²c²/676) dc db ≈ 0.12.

(Draw a figure to verify the region of integration.)

23. Note that
fX (x) =
fY (y) = ∞
−∞
∞
−∞ g(x)h(y) dy = g(x)
g(x)h(y) dx = h(y) ∞ h(y) dy,
−∞
∞ g(x) dx.
−∞ Now
∞ fX (x)fY (y) = g(x)h(y)
= f (x, y)
= f (x, y) ∞ h(y) dy
−∞ g(x) dx
−∞ ∞ ∞ −∞ −∞ ∞ ∞ −∞ −∞ h(y)g(x) dy dx
f (x, y) dy dx = f (x, y). This relation shows that X and Y are independent. 24. Let G and g be the probability distribution and probability density functions of
max(X, Y)/min(X, Y). Then G(t) = 0 if t < 1. For t ≥ 1,

    G(t) = P(max(X, Y)/min(X, Y) ≤ t) = P(max(X, Y) ≤ t·min(X, Y))
         = P(X ≤ t·min(X, Y), Y ≤ t·min(X, Y))
         = P(min(X, Y) ≥ X/t, min(X, Y) ≥ Y/t)
         = P(X ≥ X/t, Y ≥ X/t, X ≥ Y/t, Y ≥ Y/t)
         = P(Y ≥ X/t, X ≥ Y/t) = P(X/t ≤ Y ≤ tX).

This quantity is the area of the region

    {(x, y) : 0 < x < 1, 0 < y < 1, x/t ≤ y ≤ tx},

which is equal to (t − 1)/t. Hence

    G(t) = 0 if t < 1, and G(t) = (t − 1)/t if t ≥ 1,

and therefore

    g(t) = G′(t) = 1/t² if t ≥ 1, and 0 elsewhere.

25. Let F be the distribution function of X/(X + Y). Since X/(X + Y) ∈ (0, 1), we have that
0
1 F (t) = t <0
t ≥ 1. For 0 ≤ t < 1,
P X
1−t
≤t =P Y ≥
X = λ2
X+Y
t
∞ =λ ∞ ∞ ∞ e−λx e−[λ(1−t)x ]/t dx = λ 0 Therefore, e−λx e−λy dy dx [(1−t)x ]/t 0 e−λx/t dt = t. 0 ⎧
⎪0
⎨
F (t) = t
⎪
⎩
1 t <0
0≤t <1
t ≥ 1. This shows that X/(X + Y ) is uniform over (0, 1). 26. The fact that if X and Y are both normal with mean 0 and equal variance implies that f (x, y) is
circularly symmetrical is straightforward. We prove the converse; suppose that f is circularly
symmetrical, then there exists a function ϕ so that
fX (x)fY (y) = ϕ x2 + y2 . Differentiating this relation with respect to x and using
fY (y) = fX (x)fY (y)
=ϕ
fX (x) x 2 + y 2 /fX (x) yields
ϕ
ϕ x2 + y2 x2 + y2 x2 + y2 = fX (x)
.
xfX (x) 174 Chapter 8 Bivariate Distributions Now the right side of this equation is a function of x while its left side is a function of x 2 + y 2 .
This implies that fX (x)/ xfX (x) is constant. To prove this, we show that for any given x1
and x2 ,
fX (x1 )
f (x2 )
=X
.
x1 fX (x1 )
x2 fX (x2 )
2
2
2
2
Let y1 = x2 and y2 = x1 ; then x1 + y1 = x2 + y2 and we have fX (x1 )
=
x1 fX (x1 ) ϕ
ϕ 2
2
x1 + y 1 2
2
x1 + y1 2
2
x1 + y1 ϕ = 2
2
x2 + y2 2
2
x2 + y2 ϕ 2
2
x2 + y2 = fX (x2 )
.
x2 fX (x2 ) We have shown that for some constant k ,
fX (x)
= k.
xfX (x)
Therefore, fX (x)
1
= kx and hence ln fX (x) = kx 2 + c, or
fX (x)
2
fX (x) = e(1/2)kx where α = ec . Now since
then fX (x) = αe−x 2 /(2σ 2 ) ∞
−∞ 2 +c 2 = αe(1/2)kx , 2 αe(1/2)kx dx = 1, we have that k < 0. Let σ =
∞ and αe−x 2 /(2σ 2 ) −∞ −1/k ; √
dx = 1 implies that α = 1/(σ 2π ). So 1
2
2
fX (x) = √ e−x /(2σ ) , showing that X ∼ N(0, σ 2 ). The fact that Y ∼ N(0, σ 2 ) is
σ 2π
proved similarly. 8.3 CONDITIONAL DISTRIBUTIONS
2 1. pY (y) = p(x, y) =
x =1 pXY (x y) = 1
(2y 2 + 5). Thus
25 p(x, y)
(1/25)(x 2 + y 2 )
x2 + y2
=
=2
x = 1, 2, y = 0, 1, 2,
pY (y)
(1/25)(2y 2 + 5)
2y + 5 P (X = 2  Y = 1) = pXY (21) = 5/7,
2 E(X Y = 1) = 2 xpXY (x 1) =
x =1 x
x =1 x2 + 1
12
=
.
7
7 Section 8.3 2. Since Conditional Distributions 175 y fY (y) = 2 dx = 2y, 0 < y < 1,
0 we have that
fXY (x y) = 2
1
f (x, y)
=
= , 0 < x < y, 0 < y < 1.
fY (y)
2y
y 3. Let X be the number of ﬂips of the coin until the sixth head is obtained. Let Y be the number
of ﬂips of the coin until the third head is obtained. Let Z be the number of additional ﬂips
of the coin after the third head occurs until the sixth head occurs; Z is a negative binomial
random variable with parameters 3 and 1/2. By the independence of the trials,
x−6
2 pXY (x 5) = P (Z = x − 5) =
= 4. Note that
fXY x x−6
2 1
2 x −5 1
2 3 1
2 x −8 , x = 8, 9, 10, . . . . 3 x 2 + (9/16)
3
1
=
= (48x 2 + 27).
4
(27/16) + 1
43 Therefore,
P 1
1
3
<X<
Y=
=
4
2
4 1/2
1/4 17
1
(48x 2 + 27) dx = .
43
86 5. In the discrete case, let p(x, y) be the joint probability mass function of X and Y , and let A
be the set of possible values of X . Then
E(X  Y = y) = x
x ∈A p(x, y)
=
pY (y) x ∈A xpX (x)pY (y)
=
pY (y) xpX (x) = E(X).
x ∈A In the continuous case, letting f (x, y) be the joint probability density function of X and Y ,
we get
E(X  Y = y) =
= 6. Since
fY (y) = ∞
−∞ ∞ x
−∞
∞
−∞ f (x, y)
dx =
fY (y) ∞
−∞ xfX (x)fY (y)
dx
fY (y) xfX (x) dx = E(X). 1 f (x, y) dx =
0 (x + y) dx = 1
+ y,
2 176 Chapter 8 Bivariate Distributions the desired quantity is given by
⎧
⎪ x+y
⎨
fXY (x y) = (1/2) + y
⎪
⎩0 7. Clearly, ∞ fY (y) = 0 ≤ x ≤ 1, 0 ≤ y ≤ 1
elsewhere. e−x(y +1) dx = 0 1
, 0 ≤ y ≤ e − 1.
y+1 Therefore,
∞ E(X  Y = y) = −∞
∞ =
0 ∞ xfXY (x y) dx =
0 xf (x, y)
dx
fY (y) 1
xe−x(y +1)
dx =
.
1/(y + 1)
y+1 ∞ Note that, the last integral, 0 x(y + 1)e−x(y +1) dx is 1/(y + 1) because it is the expected
value of an exponential random variable with parameter y + 1. 8. Let f (x, y) be the joint probability density function of X and Y . Clearly,
f (x, y) = fXY (x y)fY (y).
Thus ∞ fX (x) = −∞ fXY (x y)fY (y) dy. Now
fY (y) =
and
fXY (x y) = ⎧
⎪
⎨ 1
1−y ⎪
⎩0 1
0 0<y<1
elsewhere,
0 < y < 1, y < x < 1
elsewhere. Therefore, for 0 < x < 1,
x fX (x) =
0 1
dy = − ln(1 − x),
1−y and hence
fX (x) = − ln(1 − x)
0 0<x<1
elsewhere. Section 8.3 Conditional Distributions 177 9. f (x, y), the joint probability density function of X and Y is given by
⎧
⎪1
⎨
f (x, y) = π
⎪
⎩0
Thus
4
=
5 fY
Now √ if x 2 + y 2 ≤ 1
otherwise. 1−(16/25) √
− 1−(16/25) 4
f x,
4
5
=
4
5
fY
5 fXY x 1
6
dx =
.
π
5π 5
=,
6 3
3
− ≤x≤ .
5
5 Therefore,
P 0≤X≤
∞ x 10. (a)
0 −x 4
4
y=
=
11
5 4/11
0 10
5
dx =
.
6
33 ce−x dy dx = 1 implies that c = 1/2. (b) fXY (x y) = (1/2)e−x f (x, y)
=
fY (y) ∞ (1/2)e −x = e−x +y  , x > y , dx y  fY X (y x) = (1/2)e−x
x (1/2)e−x dy = 1
, −x < y < x.
2x −x (c) By part (b), given X = x , Y is a uniform random variable over (−x, x). Therefore,
E(Y X = x) = 0 and
2
x − (−x)
x2
Var(Y X = x) =
=.
3
12 11. Let f (x, y) be the joint probability density function of X and Y . Since
fXY (x y) = ⎧
⎪
⎨ 3
1
=
2y
20 + (2y)/3 − 20 ⎪
⎩
0 20 < x < 20 +
otherwise, and
fY (y) = ⎧
⎨1/30 0 < y < 30 ⎩0 elsewhere, 2y
3 178 Chapter 8 Bivariate Distributions we have that
⎧
⎪1
⎨
f (x, y) = fXY (x y)fY (y) = 20y
⎪
⎩0 20 < x < 20 + 2y
, 0 < y < 30
3 elsewhere. 12. Let X be the ﬁrst arrival time. Clearly,
P X ≤ x  N(t) = 1 = 0
1 if x < 0
if x ≥ t . For 0 ≤ x < t ,
P X ≤ x  N(t) = 1 = P X ≤ x, N (t) = 1
P N (t) = 1 = P N (x) = 1, N (t − x) = 0
P N (t) = 1 e−λx (λx)1 e−λ(t −x) λ(t − x)
·
P N (x) = 1 P N (t − x) = 0
1!
0!
=
=
−λt
1
P N (t) = 1
e (λt)
1! 0 = x
,
t where the third equality follows from the independence of the random variables N(x) and
N(t − x) (recall that Poisson processes possess independent increments). We have shown that
⎧
⎪0
if x < 0
⎪
⎪
⎨
P X ≤ x  N (t) = 1 = x/t if 0 ≤ x < t
⎪
⎪
⎪
⎩
1
if x ≥ t.
This shows that the conditional distribution of X given N(t) = 1 is uniform on (0, 1). 13. For x ≤ y , the fact that the conditional distribution of X given Y = y is hypergeometric
follows from the following:
P (X = x  Y = y) = = P (X = x, Y = y)
P (X = x)P (Y − X = y − x)
=
P (Y = y)
P (Y = y) n − m y −x
mx
p (1 − p)(n−m)−(y −x)
p (1 − p)m−x ·
y−x
x
=
ny
p (1 − p)n−y
y m
x n−m
y−x
.
n
y Section 8.3 Conditional Distributions 179 It must be clear that the conditional distribution of Y given that X = x is binomial with
parameters n − m and p . That is,
n − m y −x
p (1 − p)n−m−y +x ,
y−x P (Y = y  X = x) = y = x, x + 1, . . . , n − m + x. 14. Let f (x, y) be the joint probability density function of X and Y . By the solution to Exercise 25,
Section 8.1,
f (x, y) = ⎧
⎨1/2 x  + y  ≤ 1 ⎩0 elsewhere, and
fY (y) = 1 − y , −1 ≤ y ≤ 1.
Hence
fXY (x y) = 1
1/2
=
, −1 + y  ≤ x ≤ 1 − y , −1 ≤ y ≤ 1.
1 − y 
2 1 − y  15. Let λ be the parameter of N (t) : t ≥ 0 . The fact that for s < t , the conditional distribution of N(s) given N(t) = n is binomial with parameters n and p = s/t , follows from the following
relations for i ≤ n.
P N (s) = i  N(t) = n = = = = P N (s) = i, N (t) = n
P N (t) = n P N (s) = i, N(t) − N (s) = n − i
P N (t) = n P N (s) = i P N (t − s) = n − i
P N (t) = n n
i s
t i 1− s
t n−i = = P N (s) = i P N (t) − N (s) = n − i
P N (t) = n e−λs (λs)i e−λ(t −s) λ(t − s)
·
i!
(n − i)! n−i e−λt (λt)n
n! , where the third equality follows since Poisson processes possess independent increments and
the fourth equality follows since Poisson processes are stationary. 180 Chapter 8 Bivariate Distributions For i ≥ k ,
P N (t) = i  N(s) = k = P N (t) − N(s) = i − k  N(s) = k
= P N (t) − N (s) = i − k = P N (t − s) = i − k
e−λ(t −s) λ(t − s)
=
(i − k)! i −k shows that the conditional distribution of N(t) given N (s) = k is Poisson with parameter
λ(t − s). 16. Let p(x, y) be the joint probability mass function of X and Y . Clearly,
12
13 pY (5) =
and ⎧
⎪ 11
⎪
⎪
⎪
⎪ 13
⎨
p(x, 5) = 0
⎪
⎪
⎪
⎪ 11
⎪
⎩
13 x −1 1
13 4 1
,
13 12
13 4−x 1
13 x<5
x=5 1
13 4 x −6 12
13 1
13 x > 5. Using these, we have that
∞ ∞ xpXY (x 5) = E(X  Y = 5) =
x =1
4 =
x =1 x
x =1 11
1
x
11 12 = 0.72932 +
= 0.72932 + x ∞ + x
x =6 11
12 4 11
12 4 11
= 0.702932 +
12 1
13
1
13
4 1
13 p(x, 5)
pY (5)
11
12 4 ∞ (y + 6)
y =0
∞ y
y =0 12
13 x −6 12
13 1
13 12
13
y y ∞ +6
y =0 12
13 y 12/13
1
= 13.412.
+6
2
(1/13)
1 − (12/13) Remark: In successive draws of cards from an ordinary deck of 52 cards, one at a time,
randomly, and with replacement, the expected value of the number of draws until the ﬁrst ace
is 1/(1/13) = 13. This exercise shows that knowing the ﬁrst king occurred on the ﬁfth trial
will increase, on the average, the number of trials until the ﬁrst ace 0.412 draws. Section 8.3 Conditional Distributions 181 17. Let X be the number of blue chips in the ﬁrst 9 draws and Y be the number of blue chips drawn
altogether. We have that
9 E(X  Y = 10) = x
x =0
9 = x p(x, 10)
pY (10)
9
x 12
22 x 10
22 18
10 x =1 9 = x 9−x 9
x x =1 9
10 − x
18
10 = · 9
12
10 − x 22
12 10 10 8
22
22 10−x 10
22 x −1 9 × 10
= 5,
18 where the last sum is (9 × 10)/18 because it is the expected value of a hypergeometric random
variable with N = 18, D = 9, and n = 10. 18. Clearly,
1 fX (x) = n(n − 1)(y − x)n−2 dy = n(1 − x)n−1 . x Thus
fY X (y x) = f (x, y)
n(n − 1)(y − x)n−2
(n − 1)(y − x)n−2
=
=
.
fX (x)
n(1 − x)n−1
(1 − x)n−1 Therefore,
1 E(Y  X = x) = y
x (n − 1)(y − x)n−2
n−1
dy =
n−1
(1 − x)
(1 − x)n−1 1 y(y − x)n−2 dy. x But
1 1 y(y − x)n−2 dy = x (y − x + x)(y − x)n−2 dy x 1 =
x =
Thus
E(Y  X = x) = 1 (y − x)n−1 dy + x(y − x)n−2 dy x (1 − x)n x(1 − x)n−1
+
.
n
n−1
n−1
n−1 1
(1 − x) + x =
+ x.
n
n
n 182 Chapter 8 Bivariate Distributions 19. (a) The area of the triangle is 1/2. So
f (x, y) =
1−y (b) fY (y) = 2
0 if x ≥ 0, y ≥ 0, x + y ≤ 1
elsewhere. 2 dx = 2(1 − y), 0 < y < 1. Therefore, 0 fXY (x y) = 1
2
=
, 0 ≤ x ≤ 1 − y, 0 ≤ y < 1.
2(1 y)
1−y (c) By part (b), given that Y = y , X is a uniform random variable over (0, 1 − y). Thus
E(X  Y = y) = (1 − y)/2, 0 < y < 1. 20. Clearly,
x pX (x) =
y =0 1
1
=2
2 y ! (x − y)!
e
e x! x
y =0 x!
e−2
=
y ! (x − y)!
x! x
y =0 x
y = e−2 · 2x
,
x! x
is the number of subsets of a set with x
y
elements and hence is equal to 2x . Therefore, pX (x) is Poisson with parameter 2 and so where the last equality follows since x
y =0 pY X (y x) = p(x, y)
x −x
=
2.
pX (x)
y This yields
x x −x
E(Y  X = x) =
y
2=
y
y =0 x y
y =0 x
y 1
2 y 1
2 x −y = x
,
2 where the last equality follows because the last sum is the expected value of a binomial random
variable with parameters x and 1/2. 21. Let X be the lifetime of the dead battery. We want to calculate E(X  X < s). Since X is a
continuous random variable, this is the same as E(X  X ≤ s). To ﬁnd this quantity, let
FXX≤s (t) = P (X ≤ t  X ≤ s),
and fXX≤s (t) = FXX≤s (t). Then
∞ E(X  X ≤ s) = tfXX≤s (t) dt.
0 Section 8.4 Transformations of Two Random Variables Now
P (X ≤ t, X ≤ s)
FXX≤s (t) = P (X ≤ t  X ≤ s) =
P (X ≤ s)
⎧
P (X ≤ t)
⎨
if t < s
= P (X ≤ s)
⎩
1
if t ≥ s .
Differentiating FXX≤s (t) with respect to t , we obtain
⎧
⎨ f (t) if t < s
fXX≤s (t) = F (s)
⎩
0
otherwise.
This yields
E(X  X ≤ s) = 8.4 1
F (s) s tf (t) dt.
0 TRANSFORMATIONS OF TWO RANDOM VARIABLES 1. Let f be the joint probability density function of X and Y . Clearly,
f (x, y) = 1
0 0 < x < 1, 0 < y < 1
elswhere. The system of two equations in two unknowns
−2 ln x = u
−2 ln y = v
deﬁnes a onetoone transformation of
R = (x, y) : 0 < x < 1, 0 < y < 1
onto the region
Q = (u, v) : u > 0, v > 0 .
It has the unique solution x = e−u/2 , y = e−v/2 . Hence J= 1
− e−u/2
2 0 1
= e−(u+v)/2 = 0.
4 1
− e−v/2
2
By Theorem 8.8, g(u, v), the joint probability density function of U and V is
0 g(u, v) = f e−u/2 , e−v/2 1
1 −(u+v)/2
= e−(u+v)/2 , u > 0, v > 0.
e
4
4 183 184 Chapter 8 Bivariate Distributions 2. Let f (x, y) be the joint probability density function of X and Y . Clearly,
f (x, y) = f1 (x)f2 (y), x > 0, y > 0. Let V = X and g(u, v) be the joint probability density functions of U and V . The probability
density function of U is gU (u), its marginal density function. The system of two equations in
two unknowns
x /y = u
x=v
deﬁnes a onetoone transformation of
R = (x, y) : x > 0, y > 0
onto the region
Q = (u, v) : u > 0, v > 0 .
It has the unique solution x = v , y = v/u. Hence
0
J=
− v
u2 1
1
u = v
= 0.
u2 By Theorem 8.8,
g(u, v) = f v , v
u v
v
v
v
v
= 2 f v,
= 2 f1 (v)f2
2
u
u
u
u
u Therefore, ∞ gU (u) =
0 v
v
f1 (v)f2
d v,
u2
u u > 0, v > 0. u > 0. 3. Let g(r, θ ) be the joint probability density function of R and . We will show that g(r, θ ) =
gR (r)g (θ). This proves the surprising result that R and are independent. Let f (x, y) be
the joint probability density function of X and Y . Clearly,
f (x, y) = 1 −(x 2 +y 2 )/2
,
e
2π −∞ < x < ∞, −∞ < y < ∞. Let R be the entire xy plane excluded the set of points on the x axis with x ≥ 0. This causes
no problems since
P (Y = 0, X ≥ 0) = P (Y = 0)P (X ≥ 0) = 0.
The system of two equations in two unknowns
⎧
⎨ x2 + y2 = r
⎩ arctan y = θ
x Section 8.4 Transformations of Two Random Variables deﬁnes a onetoone transformation of R onto the region
Q = (r, θ ) : r > 0, 0 < θ < 2π .
It has the unique solution
x = r cos θ
y = r sin θ.
Hence
cos θ
J= −r sin θ θ 185 242 Chapter 10 More Expectations and Variances 9. Let f be the probability density function of λ; then
∞ P (N = i) = P (N = i  λ = x)f (x) dx 0
∞ =
0 =
= e−x x i −x
e dx =
i!
∞ 11
i! 2 i 11
i! 2 i +1 ∞
0 e−2x x i
dx
i! e−2x (2x)i dx 0
∞ 1
2 e−u ui du = 0 i +1 . In these calculations, we have used the substitution u = 2x and the relation
∞ e−u ui du = i !. 0 10. Suppose that player A carries x dollars in his wallet. Then player A wins if and only if player
B carries y dollars, y ∈ (x, 1] in his wallet. Thus player A wins y dollars with probability
1 − x . In such a case, the expected amount player A wins is (1 + x)/2. Player A loses x dollars
with probability x . Therefore,
13
1+x
· (1 − x) + (−x) · x = − x 2 .
2
22
Let fX be the probability density function of X , then
E(WA  X = x) = fX (x) = 1
0 if 0 ≤ x ≤ 1
otherwise. Therefore,
1 E(WA ) = E E (WA  X) = E(WA  X = x)fX (x) dx 0
1 =
0 1
1 32
1
− x dx = x − x3
22
2
2 1
0 = 0. The solution above was presented by Kent G. Merryﬁeld, Ngo Viet, and Saleem Watson in
their joint paper "The Wallet Paradox" published in the AugustSeptember 1977 issue of the
American Mathematical Monthly. Note the following observations by the authors.
It is interesting to consider special cases of this formula for the conditional expectation. Since E(WA  X = 1) = −1 and E(WA  X = 0) = 1/2, we see that a player
carrying one dollar in his wallet should expect to lose it, whereas a player carrying
nothing in his wallet should expect to gain half a dollar (the mean). Interestingly, if a
player is carrying half a dollar (the mean) in his wallet, then E(WA  X = 1/2) = 1/8;
that is, his expectation of winning is positive. Section 10.4 Conditioning on Random Variables 243 11. (a) To derive the relation
1
1
E(Kn  Kn−1 = i) = (i + 1) + i + 1 + E(Kn )
2
2
1
= (i + 1) + E(Kn ),
2
we noted the following. It took i tosses of the coin to obtain n − 1 consecutive heads. If the
result of the next toss is heads, we have the desired n consecutive heads. This occurs with
probability 1/2. However, if the result of the next toss is tails, then, on the average, we need
an additional E(Kn ) tosses [a total of i + 1 + E(Kn ) tosses] to obtain n consecutive heads.
This also happens with probability 1/2.
(b) From (a) it should be clear that
1
E(Kn  Kn−1 ) = (Kn−1 + 1) + E(Kn ).
2
(c) Finding the expected values of both sides of (b) yields
1
E(Kn ) = E(Kn−1 ) + 1 + E(Kn ).
2
Solving this for E(Kn ), we obtain
E(Kn ) = 2 + 2E(Kn−1 ).
(d) Note that K1 is a geometric random variable with parameter 1/2. Thus E(K1 ) = 2. Solving
E(Kn ) = 2 + 2E(Kn−1 ) recursively, we get
E(Kn ) = 2 + 22 + 23 + · · · + 2n = 2(1 + 2 + · · · + 2n−1 )
2n − 1
=2·
= 2(2n − 1).
2−1 12. Suppose that the last tour left at time 0. Let X be the time from 0 until the next guided tour
begins. Let S10 be the time from 0 until 10 new tourists arrive. The random variable S10 is
gamma with parameters λ = 1/5 and n = 10. Let F and f be the probability distribution and
density functions of S10 . Then, for t ≥ 0,
(t/5)9
1
.
f (t) = e−t/5
5
9!
To ﬁnd E(X), note that
E(X) = E(X  S10 < 60)P (S10 < 60) + E(X  S10 ≥ 60)P (S10 ≥ 60)
= E(S10  S10 < 60)P (S10 < 60) + 60P (S10 ≥ 60). 244 Chapter 10 More Expectations and Variances Now 60 P (S10 < 60) =
0 1 −t/5 (t/5)9
e
dt = 0.7576,
5
9! and, by Remark 8.1,
1
F (60)
1
=
0.7576 60 E(S10  S10 < 60) = tf (t) dt
0
60
0 1 −t/5 (t/5)9
te
dt = 43.0815.
5
9! Therefore,
E(X) = (43.0815)(0.7576) + 60(1 − 0.7576) = 47.18.
This shows that the expected length of time between two consecutive tours is approximately
47 minutes and 10 seconds. 13. Let X1 be the time until the ﬁrst application arrives. Let X2 be the time between the ﬁrst and
second applications, and so forth. Then Xi ’s are independent exponential random variables
with mean 1/λ = 1/5 of a day. Let N be the ﬁrst integer for which
X1 ≤ 2, X2 ≤ 2, . . . , XN ≤ 2, XN +1 > 2.
The time that the admissions ofﬁce has to wait before doubling its student recruitment efforts
is SN +1 = X1 + X2 + · · · + XN +1 . Therefore,
∞ E(SN +1 ) = E E (SN +1  N ) = E(SN +1  N = i)P (N = i).
i =0 Now, for i ≥ 0,
i +1 E(SN +1  N = i) = E(X1 + X2 + · · · + Xi +1  N = i) = E(Xj  N = i)
j =1 i E(Xj  Xj ≤ 2) + E(Xi +1  Xi +1 > 2), =
j =1 where by Remark 8.1,
E(Xj  Xj ≤ 2) =
E(Xi +1  Xi +1 > 2) = 1
F (2) 2 tf (t) dt,
0 1
1 − F (2) ∞ tf (t) dt,
2 Section 10.4 Conditioning on Random Variables 245 F and f being the probability distribution and density functions of Xi ’s, respectively. That is,
for t ≥ 0, F (t) = 1 − e−5t , f (t) = 5e−5t . Thus, for 1 ≤ j ≤ i ,
E(Xj  Xj ≤ 2) = 2 1
1 − e−10 5t e−5t dt = (1.0000454) −t − 0 1 −5t
e
5 2
0 = (1.0000454)(0.19999) = 0.1999092
and, for j = i + 1,
E(Xi +1  Xi +1 > 2) =
Thus, for i ≥ 0, ∞ 1
e−10 5t e−5t dt = e10 −t − 2 1 −5t
e
5 ∞
2 = 2.2. E(SN +1  N = i) = (0.1999092)i + 2.2. To ﬁnd P (N = i), note that for i ≥ 0,
P (N = i) = P (X1 ≤ 2, X2 ≤ 2, . . . , Xi ≤ 2, Xi +1 > 2)
= F (2) i 1 − F (2) = (0.9999546)i (0.0000454). Putting all these together, we obtain
∞ E(SN +1 ) = E(SN +1  N = i)P (N = i)
i =0
∞ (0.1999092)i + 2.2 (0.9999546)i (0.0000454) =
i =0 ∞ ∞ i(0.9999546)i + (0.00009988) = (0.00000908)
i =0 (0.9999546)i
i =0 0.9999546
1
= (0.00000908) ·
+ (0.00009988) ·
2
(1 − 0.9999546)
1 − 0.9999546
= 4407.286,
∞
∞
i
i
= r/(1 − r)2 , and
=
where the next to last equality follows from
i =1 ir
i =0 r
1/(1 − r), r  < 1. Since an academic year is 9 months long, and contains approximately
180 business days, the admission ofﬁcers should not be concerned about this rule at all. It
will take 4,407.286 business days, on average, until there is a lapse of two days between two
consecutive applications. 14. Let Xi be the number of calls until Steven has not missed Adam in exactly i consecutive calls.
We have that
E Xi  Xi −1 = Xi −1 + 1
Xi −1 + 1 + E(Xi ) with probability p
with probability 1 − p . 246 Chapter 10 More Expectations and Variances Therefore,
E(Xi ) = E E (Xi  Xi −1 ) = E (Xi −1 ) + 1 p + E (Xi −1 ) + 1 + E(Xi ) (1 − p).
Solving this equation for E(Xi ), we obtain
E(Xi ) = 1
1 + E(Xi −1 ) .
p Now X1 is a geometric random variable with parameter p. So E(X1 ) = 1/p. Thus
E(X2 ) = 1
1
1
1 + E(X1 ) =
1+
,
p
p
p E(X3 ) = 1
1
1
1
1 + E(X2 ) =
1+ + 2 ,
p
p
pp .
.
.
E(Xk ) = 1 (1/p k ) − 1
1
1
1 − pk
1
1
1 + + 2 + · · · + k−1 = ·
=k
.
p
pp
p
p (1/p) − 1
p (1 − p) 15. Let N be the number of games to be played until Emily wins two of the most recent three
games. Let X be the number of games to be played until Emily wins a game for the ﬁrst time.
The random variable X is geometric with parameter 0.35. Hence E(X) = 1/0.35. First, we
ﬁnd the random variable E(N  X) in terms of X. Then we obtain E(N ) by calculating the
expected value of E(N  X). Let W be the event that Emily wins the (X + 1)st game as well.
Let LW be the event that Emily loses the (X + 1)st game but wins the (X + 2)nd game. Let
LL be the event that Emily loses both the (X + 1)st and the (X + 2)nd games. Given X = x ,
we have
E(N  X = x) = (x + 1)P (W ) + (x + 2)P (LW ) + (x + 2) + E(N) P (LL).
So
E(N  X = x) = (x + 1)(0.35) + (x + 2)(0.65)(0.35) + (x + 2) + E(N) (0.65)2 .
This gives
E(N  X = x) = x + (0.4225)E(N ) + 1.65.
Therefore,
E(N  X) = X + (0.4225)E(N ) + 1.65.
Hence
E(N ) = E E (N  X) = E(X) + (0.4225)E(N ) + 1.65 =
Solving this for E(N ) gives E(N) = 7.805. 1
+ (0.4225)E(N ) + 1.65.
0.35 Section 10.4 Conditioning on Random Variables 247 16. Since hemophilia is a sexlinked disease, and John is phenotypically normal, John is H .
Therefore, no matter what Kim's genotype is, none of the daughters has hemophilia. Whether a boy has hemophilia or not depends solely on the genotype of Kim. Let \(X\) be the number of boys who have hemophilia. To find \(E(X)\), the expected number of boys who have hemophilia, let
\[
Z = \begin{cases}
0 & \text{if Kim is } hh\\
1 & \text{if Kim is } Hh\\
2 & \text{if Kim is } HH.
\end{cases}
\]
Then
\begin{align*}
E(X) &= E\bigl[E(X \mid Z)\bigr]\\
&= E(X \mid Z = 0)P(Z = 0) + E(X \mid Z = 1)P(Z = 1) + E(X \mid Z = 2)P(Z = 2)\\
&= 4(0.02)(0.02) + 4(1/2)\bigl[2(0.98)(0.02)\bigr] + 0\,(0.98)(0.98) = 0.08.
\end{align*}
Therefore, on average, 0.08 of the boys, and hence 0.08 of the children, are expected to have
hemophilia.

17. Let \(X\) be the number of bags inspected until an unacceptable bag is found. Let \(K_n\) be the number of subsequent bags inspected until \(n\) consecutive acceptable bags are found. The number of bags inspected in one inspection cycle is \(X + K_m\). We are interested in \(E(X + K_m) = E(X) + E(K_m)\). Clearly, \(X\) is a geometric random variable with parameter \(\alpha(1-p)\), so \(E(X) = 1/[\alpha(1-p)]\). To find \(E(K_m)\), note that for all \(n\),
\[
E(K_n) = E\bigl[E(K_n \mid K_{n-1})\bigr].
\]
Now
\[
E(K_n \mid K_{n-1} = i) = (i+1)p + \bigl[i + 1 + E(K_n)\bigr](1-p) = (i+1) + (1-p)E(K_n). \tag{41}
\]
To derive this relation, we noted the following. It took \(i\) inspections to find \(n-1\) consecutive acceptable bags. If the next bag inspected is also acceptable, we have the \(n\) consecutive acceptable bags required in \(i+1\) inspections; this occurs with probability \(p\). However, if the next bag inspected is unacceptable, then, on average, we need an additional \(E(K_n)\) inspections, a total of \(i + 1 + E(K_n)\) inspections, until we get \(n\) consecutive acceptable bags of cinnamon; this happens with probability \(1-p\).

From (41), we have
\[
E(K_n \mid K_{n-1}) = (K_{n-1} + 1) + (1-p)E(K_n).
\]
Taking expected values of both sides of this relation gives
\[
E(K_n) = E(K_{n-1}) + 1 + (1-p)E(K_n).
\]
Solving for \(E(K_n)\), we obtain
\[
E(K_n) = \frac{1}{p} + \frac{E(K_{n-1})}{p}.
\]
Noting that \(E(K_1) = 1/p\) and solving recursively, we find that
\[
E(K_n) = \frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^n}.
\]
Therefore, the desired quantity is
\begin{align*}
E(X + K_m) &= E(X) + E(K_m)
= \frac{1}{\alpha(1-p)} + \frac{1}{p}\Bigl(1 + \frac{1}{p} + \cdots + \frac{1}{p^{m-1}}\Bigr)\\
&= \frac{1}{\alpha(1-p)} + \frac{1}{p}\cdot\frac{(1/p^m) - 1}{(1/p) - 1}
= \frac{(1-\alpha)p^m + \alpha}{\alpha p^m (1-p)}.
\end{align*}

18. For \(0 < t \le 1\), let \(N(t)\) be the number of batteries changed by time \(t\). Let \(X\) be the lifetime
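The formula \(E(K_n) = 1/p + 1/p^2 + \cdots + 1/p^n\) derived above admits a quick simulation check; a sketch (not from the manual):

```python
import random

# Sketch (not from the manual): check E(K_n) = 1/p + 1/p^2 + ... + 1/p^n,
# the expected number of inspections until n consecutive acceptable bags,
# where each bag is acceptable with probability p independently.
def expected_k(p, n):
    return sum(1.0 / p**i for i in range(1, n + 1))

def simulate_k(p, n, rng):
    inspections = streak = 0
    while streak < n:
        inspections += 1
        streak = streak + 1 if rng.random() < p else 0
    return inspections

rng = random.Random(7)
p, n, trials = 0.8, 3, 100_000
estimate = sum(simulate_k(p, n, rng) for _ in range(trials)) / trials
assert abs(estimate - expected_k(p, n)) < 0.1
```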
of the initial battery used; \(X\) is a uniform random variable over the interval \((0, 1)\). Therefore, \(f_X\), the probability density function of \(X\), is given by
\[
f_X(x) = \begin{cases} 1 & \text{if } 0 < x < 1\\ 0 & \text{otherwise.} \end{cases}
\]
We are interested in \(K(t) = E[N(t)]\). Clearly,
\begin{align*}
E[N(t)] &= E\bigl[E[N(t) \mid X]\bigr] = \int_0^{\infty} E[N(t) \mid X = x]f_X(x)\,dx\\
&= \int_0^t \bigl[1 + E[N(t-x)]\bigr]\,dx = t + \int_0^t K(t-x)\,dx = t + \int_0^t K(u)\,du,
\end{align*}
where the last equality follows from the substitution \(u = t - x\). Differentiating both sides of \(K(t) = t + \int_0^t K(u)\,du\) with respect to \(t\), we obtain \(K'(t) = 1 + K(t)\), which is equivalent to
\[
\frac{K'(t)}{1 + K(t)} = 1.
\]
Thus, for some constant \(c\),
\[
\ln\bigl[1 + K(t)\bigr] = t + c,
\]
or \(1 + K(t) = e^{t+c}\). The initial condition \(K(0) = E[N(0)] = 0\) yields \(e^c = 1\); so
\[
K(t) = e^t - 1.
\]
On average, after 950 hours of operation, \(K(0.95) = e^{0.95} - 1 = 1.586\) batteries are used.

19. Since \(E(X \mid Y)\) is a function of \(Y\), by Example 10.23,
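The renewal formula \(K(t) = e^t - 1\) for uniform \((0,1)\) lifetimes can be checked by direct simulation; a sketch (not from the manual):

```python
import math
import random

# Sketch (not from the manual): estimate K(t) = E[N(t)], the expected
# number of battery replacements by time t when lifetimes are i.i.d.
# uniform on (0, 1), and compare with the derived formula e^t - 1.
def replacements_by(t, rng):
    elapsed, count = 0.0, 0
    while True:
        elapsed += rng.random()      # lifetime of the current battery
        if elapsed > t:
            return count
        count += 1                   # battery died before time t: replace it

rng = random.Random(3)
t, trials = 0.95, 200_000
estimate = sum(replacements_by(t, rng) for _ in range(trials)) / trials
assert abs(estimate - (math.exp(t) - 1)) < 0.02
```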
\[
E(XZ) = E\bigl[E(XZ \mid Y)\bigr] = E\bigl[E\bigl(XE(X \mid Y) \bigm| Y\bigr)\bigr] = E\bigl[E(X \mid Y)E(X \mid Y)\bigr] = E(Z^2).
\]
Therefore,
\begin{align*}
E\Bigl[\bigl(X - E(X \mid Y)\bigr)^2\Bigr] &= E\bigl[(X - Z)^2\bigr] = E(X^2 - 2ZX + Z^2)\\
&= E(X^2) - 2E(Z^2) + E(Z^2) = E(X^2) - E(Z^2) = E(X^2) - E\bigl[E(X \mid Y)^2\bigr].
\end{align*}

20. Let \(Z = E(X \mid Y)\); then
\[
\operatorname{Var}(X \mid Y) = E\bigl[(X - Z)^2 \bigm| Y\bigr] = E(X^2 - 2XZ + Z^2 \mid Y) = E(X^2 \mid Y) - 2E(XZ \mid Y) + E(Z^2 \mid Y).
\]
Since \(E(X \mid Y)\) is a function of \(Y\), by Example 10.23,
\[
E(XZ \mid Y) = E\bigl(XE(X \mid Y) \bigm| Y\bigr) = E(X \mid Y)E(X \mid Y) = Z^2.
\]
Also
\[
E(Z^2 \mid Y) = E\bigl(E(X \mid Y)^2 \bigm| Y\bigr) = E(X \mid Y)^2 = Z^2,
\]
since, in general, \(E[f(Y) \mid Y] = f(Y)\): if \(Y = y\), then \(E[f(Y) \mid Y = y]\) is defined to be \(E[f(y) \mid Y = y] = f(y)\). Therefore,
\[
\operatorname{Var}(X \mid Y) = E(X^2 \mid Y) - 2Z^2 + Z^2 = E(X^2 \mid Y) - E(X \mid Y)^2.
\]

21. By the definition of variance,
\[
\operatorname{Var}\Bigl(\sum_{i=1}^{N} X_i\Bigr) = E\Bigl[\Bigl(\sum_{i=1}^{N} X_i\Bigr)^2\Bigr] - \Bigl[E\Bigl(\sum_{i=1}^{N} X_i\Bigr)\Bigr]^2, \tag{42}
\]
where by Wald's equation,
\[
\Bigl[E\Bigl(\sum_{i=1}^{N} X_i\Bigr)\Bigr]^2 = \bigl[E(X)E(N)\bigr]^2 = \bigl[E(N)\bigr]^2\bigl[E(X)\bigr]^2. \tag{43}
\]
Now since \(N\) is independent of \(\{X_1, X_2, \ldots\}\),
\[
E\Bigl[\Bigl(\sum_{i=1}^{N} X_i\Bigr)^2\Bigr]
= E\Bigl[E\Bigl[\Bigl(\sum_{i=1}^{N} X_i\Bigr)^2 \Bigm| N\Bigr]\Bigr]
= \sum_{n=1}^{\infty} E\Bigl[\Bigl(\sum_{i=1}^{n} X_i\Bigr)^2 \Bigm| N = n\Bigr]P(N = n)
= \sum_{n=1}^{\infty} E\Bigl[\Bigl(\sum_{i=1}^{n} X_i\Bigr)^2\Bigr]P(N = n).
\]
Thus
\begin{align*}
E\Bigl[\Bigl(\sum_{i=1}^{N} X_i\Bigr)^2\Bigr]
&= \sum_{n=1}^{\infty} E\Bigl[\sum_{i=1}^{n} X_i^2 + 2\sum_{i<j} X_i X_j\Bigr]P(N = n)
= \sum_{n=1}^{\infty} \Bigl[nE(X^2) + 2\sum_{i<j} E(X_i)E(X_j)\Bigr]P(N = n)\\
&= E(X^2)\sum_{n=1}^{\infty} nP(N = n) + \bigl[E(X)\bigr]^2\sum_{n=1}^{\infty} n(n-1)P(N = n)\\
&= E(X^2)E(N) + \bigl[E(X)\bigr]^2 E\bigl[N(N-1)\bigr]
= E(X^2)E(N) + \bigl[E(X)\bigr]^2 E(N^2) - \bigl[E(X)\bigr]^2 E(N).
\end{align*}
Putting this and (43) in (42), we obtain
\begin{align*}
\operatorname{Var}\Bigl(\sum_{i=1}^{N} X_i\Bigr)
&= E(X^2)E(N) + \bigl[E(X)\bigr]^2 E(N^2) - \bigl[E(X)\bigr]^2 E(N) - \bigl[E(N)\bigr]^2\bigl[E(X)\bigr]^2\\
&= E(N)\Bigl[E(X^2) - \bigl[E(X)\bigr]^2\Bigr] + \bigl[E(X)\bigr]^2\Bigl[E(N^2) - \bigl[E(N)\bigr]^2\Bigr].
\end{align*}
Therefore,
\[
\operatorname{Var}\Bigl(\sum_{i=1}^{N} X_i\Bigr) = E(N)\operatorname{Var}(X) + \bigl[E(X)\bigr]^2\operatorname{Var}(N).
\]

10.5 BIVARIATE NORMAL DISTRIBUTION

1. The conditional probability density function of \(Y\), given that \(X = 70\), is normal with mean
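The identity \(\operatorname{Var}(\sum_{i=1}^N X_i) = E(N)\operatorname{Var}(X) + [E(X)]^2\operatorname{Var}(N)\) just proved can be checked by Monte Carlo; a sketch (not from the manual), with a geometric \(N\) and uniform summands chosen purely for illustration:

```python
import random

# Sketch (not from the manual): Monte Carlo check of
# Var(sum_{i=1}^N X_i) = E(N)Var(X) + [E(X)]^2 Var(N),
# with N geometric(q) (trials until first success), X_i ~ Uniform(0, 1),
# and N independent of the X_i.
def geometric(q, rng):
    n = 1
    while rng.random() >= q:
        n += 1
    return n

q = 0.4
en, varn = 1 / q, (1 - q) / q**2           # mean/variance of geometric(q)
ex, varx = 0.5, 1 / 12                     # mean/variance of Uniform(0, 1)
theory = en * varx + ex**2 * varn

rng = random.Random(11)
samples = [sum(rng.random() for _ in range(geometric(q, rng)))
           for _ in range(200_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
assert abs(var - theory) < 0.05
```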
\[
E(Y \mid X = 70) = \mu_Y + \rho\frac{\sigma_Y}{\sigma_X}(x - \mu_X) = 60 + (0.45)\frac{2.7}{3}(70 - 71) = 59.595,
\]
and standard deviation
\[
\sigma_{Y \mid X=70} = \sqrt{(1 - \rho^2)\sigma_Y^2} = 2.7\sqrt{1 - (0.45)^2} = 2.411.
\]
Therefore, the desired probability is
\[
P(Y \ge 59 \mid X = 70) = P\Bigl(\frac{Y - 59.595}{2.411} \ge \frac{59 - 59.595}{2.411}\Bigr) = 1 - \Phi(-0.25) = \Phi(0.25) = 0.5987.
\]

2. By (10.24),
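A sketch (not from the manual) that recomputes the conditional probability above with the standard normal CDF written via `math.erf`:

```python
import math

# Sketch (not from the manual): recompute P(Y >= 59 | X = 70) for the
# bivariate normal with mu_X = 71, mu_Y = 60, sigma_X = 3, sigma_Y = 2.7,
# rho = 0.45, using Phi(z) = (1 + erf(z / sqrt 2)) / 2.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu_x, mu_y, sd_x, sd_y, rho = 71.0, 60.0, 3.0, 2.7, 0.45
cond_mean = mu_y + rho * sd_y / sd_x * (70.0 - mu_x)
cond_sd = sd_y * math.sqrt(1.0 - rho**2)
prob = 1.0 - phi((59.0 - cond_mean) / cond_sd)

assert abs(cond_mean - 59.595) < 1e-9
assert abs(cond_sd - 2.411) < 0.001
# The manual rounds the z-score to -0.25; the unrounded answer differs
# in the third decimal place.
assert abs(prob - 0.5987) < 0.005
```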
\[
f(x, y) = \frac{1}{162\pi}\exp\Bigl[-\frac{1}{162}\bigl(x^2 + y^2\bigr)\Bigr].
\]
(a) Since \(\rho = 0\), \(X\) and \(Y\) are independent normal random variables with mean 0 and standard deviation 9. Therefore,
\[
P(X \le 6, Y \le 12) = P(X \le 6)P(Y \le 12) = P\Bigl(\frac{X - 0}{9} \le \frac{6}{9}\Bigr)P\Bigl(\frac{Y - 0}{9} \le \frac{12}{9}\Bigr)
= \Phi(0.67)\Phi(1.33) = (0.7486)(0.9082) = 0.68.
\]
(b) To find \(P(X^2 + Y^2 \le 36)\), we use polar coordinates:
\[
P(X^2 + Y^2 \le 36) = \frac{1}{162\pi}\iint_{x^2 + y^2 \le 36} \exp\Bigl[-\frac{1}{162}\bigl(x^2 + y^2\bigr)\Bigr]\,dy\,dx
= \frac{1}{162\pi}\int_0^{2\pi}\!\!\int_0^6 re^{-r^2/162}\,dr\,d\theta.
\]
Now let \(u = r^2/162\); then \(du = (2r/162)\,dr\), and we get
\[
P(X^2 + Y^2 \le 36) = \frac{1}{2\pi}\int_0^{2\pi}\!\!\int_0^{2/9} e^{-u}\,du\,d\theta = 1 - e^{-2/9} \approx 0.2.
\]

3. Note that
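Both parts admit a one-line numerical check; a sketch (not from the manual):

```python
import math

# Sketch (not from the manual): check both parts of the previous
# problem numerically; here sigma = 9 and rho = 0.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

part_a = phi(6 / 9) * phi(12 / 9)          # independence since rho = 0
part_b = 1.0 - math.exp(-36.0 / 162.0)     # 1 - e^{-2/9} from polar coordinates

assert abs(part_a - 0.68) < 0.01
assert abs(part_b - 0.2) < 0.005
```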
\[
\operatorname{Var}(\alpha X + Y) = \alpha^2\sigma_X^2 + \sigma_Y^2 + 2\alpha\rho(X, Y)\sigma_X\sigma_Y.
\]
Setting \(\dfrac{d}{d\alpha}\operatorname{Var}(\alpha X + Y) = 0\), we get \(\alpha = -\rho(X, Y)\dfrac{\sigma_Y}{\sigma_X}\).

4. By (10.24), \(f(x, y)\) is maximum if and only if \(Q(x, y)\) is minimum. Let \(z_1 = \dfrac{x - \mu_X}{\sigma_X}\) and \(z_2 = \dfrac{y - \mu_Y}{\sigma_Y}\). Then \(|\rho| \le 1\) implies that
\[
Q(x, y) = z_1^2 - 2\rho z_1 z_2 + z_2^2 \ge z_1^2 - 2|\rho||z_1||z_2| + z_2^2 \ge z_1^2 - 2|z_1||z_2| + z_2^2 = \bigl(|z_1| - |z_2|\bigr)^2 \ge 0.
\]
This inequality shows that \(Q\) is minimum when \(Q(x, y) = 0\), which happens at \(x = \mu_X\) and \(y = \mu_Y\). Therefore, \((\mu_X, \mu_Y)\) is the point at which the maximum of \(f\) is obtained.

5. We have that
\begin{gather*}
f_X(x) = \int_0^x 2\,dy = 2x, \quad 0 < x < 1,\\
f_Y(y) = \int_y^1 2\,dx = 2(1 - y), \quad 0 < y < 1,\\
f_{X \mid Y}(x \mid y) = \frac{2}{2(1 - y)} = \frac{1}{1 - y}, \quad y < x < 1,\\
f_{Y \mid X}(y \mid x) = \frac{2}{2x} = \frac{1}{x}, \quad 0 < y < x.
\end{gather*}
Therefore,
\[
E(X \mid Y = y) = \int_y^1 x f_{X \mid Y}(x \mid y)\,dx = \int_y^1 \frac{x}{1 - y}\,dx = \frac{1 + y}{2}, \quad 0 < y < 1,
\]
\[
E(Y \mid X = x) = \int_0^x y f_{Y \mid X}(y \mid x)\,dy = \int_0^x \frac{y}{x}\,dy = \frac{x}{2}, \quad 0 < x < 1.
\]
Now since \(E(Y \mid X = x)\) is a linear function of \(x\) and \(E(X \mid Y = y)\) is a linear function of \(y\), by Lemma 10.3,
\[
\mu_Y + \rho\frac{\sigma_Y}{\sigma_X}(x - \mu_X) = \frac{x}{2}
\qquad\text{and}\qquad
\mu_X + \rho\frac{\sigma_X}{\sigma_Y}(y - \mu_Y) = \frac{1 + y}{2}.
\]
These relations imply that
\[
\rho\frac{\sigma_Y}{\sigma_X} = \frac{1}{2} \qquad\text{and}\qquad \rho\frac{\sigma_X}{\sigma_Y} = \frac{1}{2}.
\]
Hence \(\rho > 0\) and \(\rho^2 = \rho\dfrac{\sigma_Y}{\sigma_X}\cdot\rho\dfrac{\sigma_X}{\sigma_Y} = \dfrac{1}{4}\). Therefore \(\rho = 1/2\).
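The value \(\rho = 1/2\) can be confirmed by sampling the triangle; a sketch (not from the manual), using the fact that the min and max of two independent uniforms are jointly uniform on this triangle:

```python
import math
import random

# Sketch (not from the manual): sample (X, Y) uniformly on the triangle
# 0 < y < x < 1 (joint density 2) and check that the sample correlation
# is near the derived value rho = 1/2.
rng = random.Random(5)
xs, ys = [], []
for _ in range(200_000):
    a, b = rng.random(), rng.random()
    x, y = max(a, b), min(a, b)      # order statistics land uniformly on the triangle
    xs.append(x)
    ys.append(y)

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
vx = sum((x - mx) ** 2 for x in xs) / n
vy = sum((y - my) ** 2 for y in ys) / n
rho = cov / math.sqrt(vx * vy)
assert abs(rho - 0.5) < 0.01
```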
6. We use Theorem 8.8 to find the joint probability density function of \(X\) and \(Y\). The joint probability density function of \(Z\) and \(W\) is given by
\[
f(z, w) = \frac{1}{2\pi}\exp\Bigl[-\frac{1}{2}\bigl(z^2 + w^2\bigr)\Bigr].
\]
Let \(h_1(z, w) = \sigma_1 z + \mu_1\) and \(h_2(z, w) = \sigma_2\bigl(\rho z + \sqrt{1 - \rho^2}\,w\bigr) + \mu_2\). The system of equations
\[
\begin{cases}
\sigma_1 z + \mu_1 = x\\
\sigma_2\bigl(\rho z + \sqrt{1 - \rho^2}\,w\bigr) + \mu_2 = y
\end{cases}
\]
defines a one-to-one transformation of \(\mathbf{R}^2\) in the \(zw\)-plane onto \(\mathbf{R}^2\) in the \(xy\)-plane. It has the unique solution
\[
z = \frac{x - \mu_1}{\sigma_1}, \qquad
w = \frac{1}{\sqrt{1 - \rho^2}}\Bigl[\frac{y - \mu_2}{\sigma_2} - \frac{\rho(x - \mu_1)}{\sigma_1}\Bigr]
\]
for \(z\) and \(w\) in terms of \(x\) and \(y\). Moreover,
\[
J = \begin{vmatrix}
\dfrac{1}{\sigma_1} & 0\\[6pt]
-\dfrac{\rho}{\sigma_1\sqrt{1 - \rho^2}} & \dfrac{1}{\sigma_2\sqrt{1 - \rho^2}}
\end{vmatrix}
= \frac{1}{\sigma_1\sigma_2\sqrt{1 - \rho^2}} \ne 0.
\]
Hence, by Theorem 8.8, the joint probability density function of \(X\) and \(Y\) is given by
\[
f_{XY}(x, y) = \frac{1}{\sigma_1\sigma_2\sqrt{1 - \rho^2}}\,
f\Bigl(\frac{x - \mu_1}{\sigma_1},\ \frac{1}{\sqrt{1 - \rho^2}}\Bigl[\frac{y - \mu_2}{\sigma_2} - \rho\,\frac{x - \mu_1}{\sigma_1}\Bigr]\Bigr).
\]
Noting that \(f(z, w) = \frac{1}{2\pi}\exp\bigl[-\frac{1}{2}(z^2 + w^2)\bigr]\), straightforward calculations result in (10.24), showing that the joint probability density function of \(X\) and \(Y\) is bivariate normal.

7. Using Theorem 8.8, it is straightforward to show that the joint probability density function of
\(X + Y\) and \(X - Y\) is bivariate normal. Since
\[
\rho(X + Y, X - Y) = \frac{\operatorname{Cov}(X + Y, X - Y)}{\sigma_{X+Y}\,\sigma_{X-Y}} = \frac{\operatorname{Var}(X) - \operatorname{Var}(Y)}{\sigma_{X+Y}\,\sigma_{X-Y}} = 0,
\]
\(X + Y\) and \(X - Y\) are uncorrelated. But for bivariate normal random variables, being uncorrelated and being independent are equivalent. So \(X + Y\) and \(X - Y\) are independent.

REVIEW PROBLEMS FOR CHAPTER 10
1. Number the last 10 graduates who will walk on the stage 1 through 10. Let \(X_i = 1\) if the \(i\)th graduate receives his or her own diploma, and 0 otherwise. The number of graduates who receive their own diploma is \(X = X_1 + X_2 + \cdots + X_n\). Since
\[
E(X_i) = 1\cdot\frac{1}{n} + 0\cdot\Bigl(1 - \frac{1}{n}\Bigr) = \frac{1}{n},
\]
we have
\[
E(X) = E(X_1) + E(X_2) + \cdots + E(X_n) = n\cdot\frac{1}{n} = 1.
\]

2. Since
\[
E(X) = \int_1^2 (2x^2 - 2x)\,dx = \frac{5}{3}
\qquad\text{and}\qquad
E(X^3) = \int_1^2 (2x^4 - 2x^3)\,dx = \frac{49}{10},
\]
we have that
\[
E(X^3 + 2X - 7) = \frac{49}{10} + \frac{10}{3} - 7 = \frac{37}{30}.
\]

3. Since
\[
E(X^2) = \frac{1}{3}\int_0^1\!\!\int_0^2 (3x^5 + x^3 y)\,dy\,dx = \frac{1}{2}
\qquad\text{and}\qquad
E(XY) = \frac{1}{3}\int_0^1\!\!\int_0^2 (3x^4 y + x^2 y^2)\,dy\,dx = \frac{94}{135},
\]
we have that
\[
E(X^2 + 2XY) = \frac{1}{2} + \frac{188}{135} = \frac{511}{270}.
\]
135 Chapter 10 Review Problems 255 4. Let X1 , X2 , . . . , Xn be geometric random variables with parameters 1, (n − 1)/n, (n − 2)/n,
. . . , 1/n, respectively. The desired quantity is
n
n
+
+ ··· + n
E(X1 + X2 + · · · + Xn ) = 1 +
n−1 n−2
1
1
1
+
+ · · · + + 1 = 1 + nan−1 .
=1+n
n−1 n−2
5. Let \(X\) be the number of tosses until 4 consecutive sixes. Let \(Y\) be the number of tosses until the first non-six outcome is obtained. We have
\begin{align*}
E(X) = E\bigl[E(X \mid Y)\bigr] &= \sum_{i=1}^{\infty} E(X \mid Y = i)P(Y = i)
= \sum_{i=1}^{4} E(X \mid Y = i)P(Y = i) + \sum_{i=5}^{\infty} E(X \mid Y = i)P(Y = i)\\
&= \sum_{i=1}^{4} \bigl[i + E(X)\bigr]\Bigl(\frac{1}{6}\Bigr)^{i-1}\frac{5}{6} + \sum_{i=5}^{\infty} 4\Bigl(\frac{1}{6}\Bigr)^{i-1}\frac{5}{6}.
\end{align*}
This equation reduces to
\[
E(X) = \bigl[1 + E(X)\bigr]\frac{5}{6} + \bigl[2 + E(X)\bigr]\frac{1}{6}\cdot\frac{5}{6} + \bigl[3 + E(X)\bigr]\Bigl(\frac{1}{6}\Bigr)^2\frac{5}{6} + \bigl[4 + E(X)\bigr]\Bigl(\frac{1}{6}\Bigr)^3\frac{5}{6} + 4\cdot\frac{5}{6}\cdot\frac{(1/6)^4}{1 - (1/6)}.
\]
Solving this equation for \(E(X)\), we obtain \(E(X) = 1554\).

6. \(f(x, y, z) = (2x)(2y)(2z)\), \(0 < x < 1\), \(0 < y < 1\), \(0 < z < 1\). Since \(2x\), \(0 < x < 1\),
is a probability density function, \(2y\), \(0 < y < 1\), is a probability density function, and \(2z\), \(0 < z < 1\), is also a probability density function, these three functions are \(f_X(x)\), \(f_Y(y)\), and \(f_Z(z)\), respectively. Therefore, \(f(x, y, z) = f_X(x)f_Y(y)f_Z(z)\), showing that \(X\), \(Y\), and \(Z\) are independent. Thus
\[
\rho(X, Y) = \rho(Y, Z) = \rho(X, Z) = 0.
\]

7. Since \(\operatorname{Cov}(X, Y) = \sigma_X\sigma_Y\rho(X, Y) = 2\),
\[
\operatorname{Var}(3X - 5Y + 7) = \operatorname{Var}(3X - 5Y) = 9\operatorname{Var}(X) + 25\operatorname{Var}(Y) - 30\operatorname{Cov}(X, Y) = 9 + 225 - 60 = 174.
\]

8. Clearly,
\[
p_X(1) = p(1, 1) + p(1, 3) = 12/25, \qquad p_X(2) = p(2, 3) = 13/25;
\]
\[
p_Y(1) = p(1, 1) = 2/25, \qquad p_Y(3) = p(1, 3) + p(2, 3) = 23/25.
\]
Therefore,
\[
p_X(x) = \begin{cases} 12/25 & \text{if } x = 1\\ 13/25 & \text{if } x = 2, \end{cases}
\qquad
p_Y(y) = \begin{cases} 2/25 & \text{if } y = 1\\ 23/25 & \text{if } y = 3. \end{cases}
\]
These yield
\begin{gather*}
E(X) = 1\cdot\frac{12}{25} + 2\cdot\frac{13}{25} = \frac{38}{25}; \qquad
E(Y) = 1\cdot\frac{2}{25} + 3\cdot\frac{23}{25} = \frac{71}{25};\\
E(XY) = (1)(1)\frac{1}{25}(1^2 + 1^2) + (1)(3)\frac{1}{25}(1^2 + 3^2) + (2)(3)\frac{1}{25}(2^2 + 3^2) = \frac{22}{5}.
\end{gather*}
Thus
\[
\operatorname{Cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{22}{5} - \frac{38}{25}\cdot\frac{71}{25} = \frac{52}{625}.
\]
9. In Exercise 6, Section 8.1, we calculated \(p(x, y)\), \(p_X(x)\), and \(p_Y(y)\). The results of that exercise yield
\[
E(X) = \sum_{x=2}^{12} x\,p_X(x) = 7; \qquad
E(Y) = \sum_{y=0}^{5} y\,p_Y(y) = 35/18; \qquad
E(XY) = \sum_{x=2}^{12}\sum_{y=0}^{5} xy\,p(x, y) = 245/18.
\]
Therefore,
\[
\operatorname{Cov}(X, Y) = E(XY) - E(X)E(Y) = (245/18) - 7(35/18) = 0.
\]
This shows that \(X\) and \(Y\) are uncorrelated. Note that \(X\) and \(Y\) are not independent, as the following shows:
\[
1/36 = p(2, 0) \ne p_X(2)p_Y(0) = (1/36)(6/36) = 1/216.
\]

10. Let \(p\) be the probability mass function of \(|X - Y|\), \(q\) the probability mass function of \(X + Y\),
and \(r\) the probability mass function of \(|X^2 - Y^2|\). We have

x: 0  1  2
p(x): 726/1296  520/1296  50/1296

x: 0  1  2  3  4
q(x): 625/1296  500/1296  150/1296  20/1296  1/1296

x: 0  1  3  4
r(x): 726/1296  500/1296  20/1296  50/1296

Using these we obtain
\[
E|X - Y| = \frac{620}{1296}, \qquad E(X + Y) = \frac{864}{1296}, \qquad E|X^2 - Y^2| = \frac{760}{1296},
\]
\[
E\bigl(|X - Y|^2\bigr) = \frac{720}{1296}, \qquad E\bigl[(X + Y)^2\bigr] = \frac{1296}{1296} = 1,
\]
so \(\sigma_{|X-Y|} = 0.572\) and \(\sigma_{X+Y} = 0.745\). Therefore,
\[
\rho\bigl(|X - Y|,\ X + Y\bigr) = \frac{\operatorname{Cov}\bigl(|X - Y|,\ X + Y\bigr)}{\sigma_{|X-Y|}\,\sigma_{X+Y}}
= \frac{E|X^2 - Y^2| - E|X - Y|\,E(X + Y)}{\sigma_{|X-Y|}\,\sigma_{X+Y}}
= \frac{(760/1296) - (620/1296)(864/1296)}{(0.572)(0.745)} = 0.628.
\]
under the curve y = sin x from x = 0 to x = π/2 divided by the area of the rectangle
[0, π/2] × [0, 1]. Hence it is
π /2 sin x dx
0 π/2 = 2
.
π A second way to ﬁnd this probability is to note that (X, Y ) lies below the curve y = sin x if
and only if Y < sin X. Noting that f , the probability density function of X is given by
⎧
π
⎪2
⎨
if 0 < x <
2
f (x) = π
⎪
⎩0 otherwise,
and conditioning on X , we obtain
π /2 P (Y < sin X) = π /2 P (Y < sin X  X = x)f (x)dx = 0 2
= − cos x
π 0
π /2
0 2
=.
π sin x − 0 2
· dx
1−0
π 258 Chapter 10 More Expectations and Variances 12. (a) Clearly,
\[
f_X(x) = \int_0^x e^{-x}\,dy = xe^{-x}, \quad 0 < x < \infty, \qquad
f_Y(y) = \int_y^{\infty} e^{-x}\,dx = e^{-y}, \quad 0 < y < \infty.
\]
(b) We have that
\[
E(X) = \int_0^{\infty} x^2 e^{-x}\,dx = 2, \qquad E(Y) = \int_0^{\infty} ye^{-y}\,dy = 1,
\]
\[
E(X^2) = \int_0^{\infty} x^3 e^{-x}\,dx = 6, \qquad E(Y^2) = \int_0^{\infty} y^2 e^{-y}\,dy = 2.
\]
Therefore, \(\operatorname{Var}(X) = 2\) and \(\operatorname{Var}(Y) = 1\). Also
\[
E(XY) = \int_0^{\infty}\!\!\int_y^{\infty} xy\,e^{-x}\,dx\,dy = 3.
\]
Thus
\[
\rho(X, Y) = \frac{E(XY) - E(X)E(Y)}{\sigma_X\sigma_Y} = \frac{3 - 2}{\sqrt{2}\cdot 1} = \frac{1}{\sqrt{2}}.
\]
13. Let \(h(\alpha, \beta) = E\bigl[(Y - \alpha - \beta X)^2\bigr]\). Then
\[
h(\alpha, \beta) = E(Y^2) + \alpha^2 + \beta^2 E(X^2) - 2\alpha E(Y) - 2\beta E(XY) + 2\alpha\beta E(X).
\]
Setting \(\dfrac{\partial h}{\partial\alpha} = 0\) and \(\dfrac{\partial h}{\partial\beta} = 0\), we obtain
\[
\alpha + E(X)\beta = E(Y), \qquad E(X)\alpha + E(X^2)\beta = E(XY).
\]
Solving this system of two equations in two unknowns, we obtain
\[
\beta = \frac{\operatorname{Cov}(X, Y)}{\sigma_X^2} = \frac{\rho\sigma_X\sigma_Y}{\sigma_X^2} = \rho\frac{\sigma_Y}{\sigma_X},
\qquad
\alpha = \mu_Y - \rho\frac{\sigma_Y}{\sigma_X}\mu_X.
\]
Therefore, \(Y = \mu_Y + \rho\dfrac{\sigma_Y}{\sigma_X}(X - \mu_X)\).
∞ E(X) =
0 ∞
0 ∞ xye−y(1+x) dy dx =
0 x
1+x ∞
0 (1 + x)ye−y(1+x) dy d x. Chapter 10 Review Problems 259 ∞ Now 0 (1 + x)ye−y(1+x) dy is the expected value of an exponential random variable with
parameter 1 + x , so it is 1/(1 + x). Letting u = 1 + x , we have
∞ E(X) =
0 = ∞ 1 ∞
x
u−1
dx =
du
2
(1 + x)
u2
1
∞
∞
1
1
du −
du = ln u
− 1 = ∞.
u
u2
1
1 (b) To ﬁnd E(X Y ), note that
∞ E(X  Y = y) = ∞ xfXY (x y) dx = x 0 where ∞ fY (y) = 0
∞ ye−y(1+x) dx = e−y 0 f (x, y)
dx,
fY (y) ye−yx dx = e−y . 0 ∞ Note that 0 ye−yx dx = 1 because ye−yx is the probability density function of an exponential
random variable with parameter 1. So
∞ E(X  Y = y) = x
0 ye−y e−yx
dx =
e−y ∞ xye−xy dx = 0 1
,
y where the last equality holds because the last integral is the expected value of an exponential
random variable with parameter y . Since ∀y > 0, E(X  Y = y) = 1/y , E(XY ) = 1/Y. 15. Let X and Y denote the number of minutes past 10:00 A.M. that bus A and bus B arrive at
the station, respectively. \(X\) is uniformly distributed over \((0, 30)\). Given that \(X = x\), \(Y\) is uniformly distributed over \((0, x)\). Let \(f(x, y)\) be the joint probability density function of \(X\) and \(Y\). We calculate \(E(Y)\) by conditioning on \(X\):
\[
E(Y) = E\bigl[E(Y \mid X)\bigr] = \int_{-\infty}^{\infty} E(Y \mid X = x)f_X(x)\,dx
= \int_0^{30} \frac{x}{2}\cdot\frac{1}{30}\,dx = \frac{30}{4}.
\]
Thus the expected arrival time of bus B is 7.5 minutes past 10:00 A.M.

16. To find the distribution function of \(\sum_{i=1}^{N} X_i\), note that
\begin{align*}
P\Bigl(\sum_{i=1}^{N} X_i \le t\Bigr)
&= \sum_{n=1}^{\infty} P\Bigl(\sum_{i=1}^{N} X_i \le t \Bigm| N = n\Bigr)P(N = n)
= \sum_{n=1}^{\infty} P\Bigl(\sum_{i=1}^{n} X_i \le t \Bigm| N = n\Bigr)P(N = n)\\
&= \sum_{n=1}^{\infty} P\Bigl(\sum_{i=1}^{n} X_i \le t\Bigr)P(N = n),
\end{align*}
where the last equality follows since \(N\) is independent of \(X_1, X_2, X_3, \ldots\). Now \(\sum_{i=1}^{n} X_i\) is a gamma random variable with parameters \(n\) and \(\lambda\). Thus
\begin{align*}
P\Bigl(\sum_{i=1}^{N} X_i \le t\Bigr)
&= \sum_{n=1}^{\infty} \int_0^t \lambda e^{-\lambda x}\frac{(\lambda x)^{n-1}}{(n-1)!}\,dx\,(1 - p)^{n-1}p
= \int_0^t \lambda p e^{-\lambda x}\sum_{n=1}^{\infty} \frac{\bigl[\lambda(1 - p)x\bigr]^{n-1}}{(n-1)!}\,dx\\
&= \int_0^t \lambda p e^{-\lambda x}e^{\lambda(1-p)x}\,dx
= \int_0^t \lambda p e^{-\lambda p x}\,dx = 1 - e^{-\lambda p t}.
\end{align*}
This shows that \(\sum_{i=1}^{N} X_i\) is exponential with parameter \(\lambda p\).

17. Let \(X_1, X_2, \ldots, X_i, \ldots, X_{20}\) be geometric random variables with parameters \(1, 19/20, \ldots,\)
\(\bigl(20 - (i-1)\bigr)/20, \ldots, 1/20\), respectively. The desired quantity is
\[
E\Bigl(\sum_{i=1}^{20} X_i\Bigr) = \sum_{i=1}^{20} E(X_i) = \sum_{i=1}^{20} \frac{20}{20 - (i - 1)} = 71.9548.
\]

Chapter 11
Sums of Independent Random Variables and Limit Theorems

11.1 MOMENT-GENERATING FUNCTIONS
1. \(M_X(t) = E\bigl(e^{tX}\bigr) = \sum_{x=1}^{5} e^{tx}p(x) = \dfrac{1}{5}\bigl(e^t + e^{2t} + e^{3t} + e^{4t} + e^{5t}\bigr)\).

2. (a) For \(t \ne 0\),
\[
M_X(t) = E\bigl(e^{tX}\bigr) = \int_{-1}^{3} \frac{1}{4}e^{tx}\,dx = \frac{1}{4}\cdot\frac{e^{3t} - e^{-t}}{t},
\]
whereas for \(t = 0\), \(M_X(0) = 1\). Thus
\[
M_X(t) = \begin{cases} \dfrac{1}{4}\cdot\dfrac{e^{3t} - e^{-t}}{t} & \text{if } t \ne 0\\[6pt] 1 & \text{if } t = 0. \end{cases}
\]
Since \(X\) is uniform over \((-1, 3)\), \(E(X) = \dfrac{-1 + 3}{2} = 1\) and \(\operatorname{Var}(X) = \dfrac{\bigl[3 - (-1)\bigr]^2}{12} = \dfrac{4}{3}\).

(b) By the definition of derivative,
\begin{align*}
E(X) = M_X'(0) &= \lim_{h \to 0} \frac{M_X(h) - M_X(0)}{h}
= \lim_{h \to 0} \frac{1}{h}\Bigl[\frac{e^{3h} - e^{-h}}{4h} - 1\Bigr]
= \lim_{h \to 0} \frac{e^{3h} - e^{-h} - 4h}{4h^2}\\
&= \lim_{h \to 0} \frac{3e^{3h} + e^{-h} - 4}{8h}
= \lim_{h \to 0} \frac{9e^{3h} - e^{-h}}{8} = 1,
\end{align*}
where the last two equalities follow from L'Hôpital's rule.

3. Note that
∞ MX (t) = E etX = etx · 2
x =1 1
3 ∞ x =2 ∞ etx · e−x ln 3 = 2 x =1 ex(t −ln 3) . x =1 Restricting the domain of MX (t) to the set t : t < ln 3 and using the geometric series
theorem, we get
MX (t) = 2 et −ln 3
2e t
.
=
3 − et
1 − et −ln 3 (Note that e− ln 3 = 1/3.) Differentiating MX (t), we obtain
6et MX (t) = 3 − et 2 , which gives E(X) = MX (0) = 3/2. 4. For t = 0, MX (0) = 1. For t = 0, using integration by parts, we obtain
1 MX (t) = 2xetx dx = 0 2e t
2
2e t
− 2 + 2.
t
t
t 5. (a) For t = 0, MX (0) = 1. For t = 0,
1 MX (t) =
0 =6 1 etx · 6x(1 − x) dx = 6 1 xetx dx − 6 0 x 2 etx dx
0 e
e
e
1
12(1 − et ) 6(1 + et )
2e
2e
2
− 2 + 2 −6
− 2+ 3−3 =
+
.
t
t
t
t
t
t
t
t3
t2
t t t t t (b) By the deﬁnition of derivative,
12(1 − et ) 6(1 + et )
+
−1
MX (t) − MX (0)
t3
t2
= lim
E(X) = MX (0) = lim
t →0
t →0
t
t
12(1 − et ) + 6t (1 + et ) − t 3
1
=,
4
t →0
t
2 = lim where the last equality is calculated by applying L’Hôpital’s rule four times. 6. Let A be the set of possible values of X. Clearly, MX (t) = x ∈A etx p(x), where p(x) is the Section 11.1 MomentGenerating Functions 263 probability mass function of X . Therefore,
MX (t) = xetx p(x),
x ∈A MX (t) = x 2 etx p(x),
x ∈A .
.
.
(n)
MX (t) = x n etx p(x).
x ∈A Therefore,
(n)
MX (0) = x n p(x) = E(Xn ).
x ∈A 7. (a) By deﬁnition,
∞ MX (t) = E e tX = e
x =0 tx e ∞ −λ x λ
(λet )x
= e−λ
= e−λ exp(λet ) = exp λ(et − 1) .
x!
x!
x =0 (b) From
MX (t) = λet exp λ(et − 1)
and
MX (t) = λet 2 exp λ(et − 1) + λet exp λ(et − 1) , we obtain E(X) = MX (0) = λ and E(X2 ) = MX (0) = λ2 + λ. Therefore,
Var(X) = (λ2 + λ) − λ2 = λ. 8. The probability density function of X is given by
f (x) = ⎧
⎪
⎨ 1
b−a ⎪
⎩0 if a < x < b
otherwise. Therefore, for t = 0,
b MX (t) = E etX =
a 1
1 tx
etb − eta
e dx =
,
b−a
b−a
t whereas for t = 0, MX (0) = 1. Thus
⎧
etb − eta
⎨1
MX (t) = b − a
t
⎩
1 if t = 0
if t = 0. 264 Chapter 11 Sums of Independent Random Variables and Limit Theorems 9. The probability mass function of a geometric random variable X, p(x) with parameter p is
given by p(x) = pq x −1 , Thus ∞ MX (t) = q = 1 − p,
pq x −1 etx = x =1
∞
x =1 x = 1, 2, 3, . . . .
p
q ∞ x q et .
x =1 tx q e converges to q et / 1 − qet if qet < 1
Now by the geometric series theorem,
or, equivalently, if t < − ln q. Restricting the domain of MX (t) to the set {t : t < − ln q }, we
obtain
∞
p
p
pet
qet
x
q et = ·
=
.
MX (t) =
q x =1
q 1 − qet
1 − qet
Now
MX (t) = pet
(1 − qet )2 MX (t) = and Therefore, p
1
=.
(1 − q)2
p E(X) = MX (0) =
and
E(X2 ) = MX (0) = pet + pqe2t
.
(1 − qet )3 p(1 + q)
1+q
=
.
3
(1 − q)
p2 Thus
Var(X) = E(X2 ) − E (X) 2 = 1+q
1
q
− 2 = 2.
2
p
p
p 10. Let X be a discrete random variable with the probability mass function p(x) = x/21, x =
1, 2, 3, 4, 5, 6. The momentgenerating function of X is the given function. 11. X is a discrete random variable with the set of possible values {1, 3, 4, 5} and probability mass
function
x
p(x) 1
5/15 3
4/15 4
2/15 5
4/15. 12. We have that
M2X+1 (t) = E e(2X+1)t = et E e2tX = et MX (2t) = 13. Note that
MX (t) = 24
,
(2 − t)4 MX (t) = et
,
1 − 2t 96
.
(2 − t)5 Therefore, 24
3
=,
16
2
and hence Var(X) = 3 − (9/4) = 3/4.
E(X) = MX (0) = t< E(X2 ) = MX (0) = 96
= 3,
32 1
.
2 Section 11.1 MomentGenerating Functions 265 (r)
(r)
14. Since for odd r ’s, MX (t) = (et − e−t )/6 and for even r ’s, MX (t) = (et + e−t )/6, we have that E(Xr ) = 0 if r is odd and E(Xr ) = 1/3 if r is even. 15. For a random variable X, we must have MX (0) = 1. Since t/(1 − t) is 0 at 0, it cannot be a
momentgenerating function. 16. (a) The distribution of X is binomial with parameters 7 and 1/4.
(b) The distribution of X is geometric with parameter 1/2.
(c) The distribution of X is gamma with parameters r and 2.
(d) The distribution of X is Poisson with parameter λ = 3. 17. Since 1t 24
e+
,
3
3
X is a binomial random variable with parameters 4 and 1/3; therefore,
MX (t) = 2 P (X ≤ 2) =
i =0 18. By relation (11.2), ∞ MX (t) =
n=0 4
i 1
3 2n n
t=
n! ∞
n=0 4−i 2
3 i 8
=.
9 (2t)n
= e 2t .
n! This shows that X = 2 with probability 1. 19. We know that for t = 0,
MX (t) = et − 1
et − 1
=
.
t (1 − 0)
t Therefore, for t = 0,
MaX+b (t) = E et (aX+b) = ebt E eatX = ebt MX (at)
= ebt · eat − 1
e(a +b)t − ebt
,
=
at
(a + b) − b t which is the momentgenerating function of a uniform random variable over (b, a + b). 20. Let µn = E(Z n ); then
∞ MX (t) =
n=0 Now et = ∞
n
n=0 (t /n!). e t 2 /2 n
MX (0) n
t=
n! ∞
n=0 µn n
t.
n! Therefore,
∞ =
n=0 (t 2 /2)n
=
n! ∞
n=0 t 2n
=
2n n! ∞
n=0 (2n)! t 2n
.
2n n! (2n)! (44) 266 Chapter 11 Sums of Independent Random Variables and Limit Theorems (2n)!
comparing this relation with (44), we obtain E(Z 2n+1 ) = 0, ∀n ≥ 0 and E(Z 2n ) = n ,
2 n!
∀n ≥ 1. 21. By deﬁnition,
MX (t) = ∞ λr
(r) etx x r −1 e−λx dx = 0 ∞ λr
(r) e(t −λ)x x r −1 dx. 0 This integral converges if t < λ. Therefore, if we restrict the range of MX (t) to t < λ, by the
substitution u = (λ − t)x , we obtain
MX (t) = ∞ λr
(r) 0 λr
(r)
λ
e−u ur −1
du =
·
=
r
r
(λ − t)
(r) (λ − t)
λ−t r . Now MX (t) = rλr (λ − t)−r −1 ; thus E(X) = MX (0) = r/λ. Also
MX (t) = r(r + 1)λr (λ − t)−r −2 ;
therefore, E(X2 ) = MX (0) = r (r + 1) /λ2 , and hence
Var(X) = r
r(r + 1)
−
λ2
λ 2 = r
.
λ2 22. (a) Let F be the distribution function of X. We have that
P (−X ≤ t) = P (X ≥ −t) = ∞ f (x) dx.
−t Letting u = −x and noting that f (−u) = f (u), we obtain
−∞ P (−X ≤ t) = t f (−u) (− du) = t −∞ f (u) du = F (t). This shows that the distribution function of −X is also F .
(b) Clearly,
MX (−t) = ∞ e−tx f (x) dx. −∞ Letting u = −x , we get
MX (−t) = ∞
−∞ etu f (−u) du = ∞
−∞ etu f (u) du = MX (t). A second way to explain this is to note that MX (−t) is the momentgenerating function of
−X . Since X and −X are identically distributed, we must have that MX (t) = MX (−t). Section 11.1 23. Note that ∞ MX (t) = E etX =
x =1 MomentGenerating Functions 6 tx
6
e=2
2x2
π
π ∞
x =1 267 etx
.
x2 Now by the ratio test,
et (x +1) /(x + 1)2
x2
= lim 2
et = et
x →∞
x →∞ x + 2x + 1
etx /x 2
lim ∞
x =1 which is > 1 for t ∈ (0, ∞). Therefore,
of the form (−δ, δ), δ > 0, MX (t) exists. etx
diverges on (0, ∞) and thus on no interval
x2 24. For t < 1/2, (11.2) implies that
∞ MX (t) =
n=0 = 1d
2 dt E(Xn ) n
t=
n!
∞ ∞ 1
(n + 1)(2t) =
2
n=0 (2t)n+1 = n=0 ∞ n 1d
·
2 dt 1/2
1
=
=
2
(1 − 2t)
(1/2) − t n=0 d
(2t)n+1
dt ∞ (2t)n − 1 =
n=0 1d
1
·
−1
2 dt 1 − 2t 2 . We see that for t < 1/2, MX (t) exists; furthermore, it is the momentgenerating function of a
gamma random variable with parameters r = 2 and λ = 1/2. 25. (a) At the end of the ﬁrst period, with probability 1, the investment will grow to
A+A X
X
=A 1+
;
k
k at the end of the second period, with probability 1, it will grow to
A 1+ X
X
X
X
+A 1+
·
=A 1+
k
k
k
k 2 ; Xn
.
k
Dividing a year into k equal periods allows the banks to compound interest quarterly,
monthly, or daily. If we increase k , we can compound interest every minute, second,
or even fraction of a second. For an inﬁnitesimal ε > 0, suppose that the interest
is compounded at the end of each period of length ε. If ε → 0, then the interest is
compounded continuously. Since a year is 1/ε periods, each of length ε, the interest
rate per period of length ε is the random variable X/(1/ε) = εX . Suppose that at time
t , the investment has grown to A(t). Then at t + ε, with probability 1, the investment
will be
A(t + ε) = A(t) + A(t) · εX. and, in general, at the end of the nth period, with probability 1, it will grow to A 1+
(b) 268 Chapter 11 Sums of Independent Random Variables and Limit Theorems This implies that
P A(t + ε) − A(t)
= XA(t) = 1.
ε Letting ε → 0, yields
A(t + ε) − A(t)
= XA(t) = 1
ε→0
ε P lim or, equivalently, with probability 1,
A (t) = XA(t).
(c) Part (b) implies that, with probability 1,
A (t)
= X.
A(t)
Integrating both sides of this equation, we obtain that, with probability 1,
ln[A(t)] = tX + C,
or
A(t) = etX+c .
Considering the fact that A(0) = A, this equation yields A = ec . Therefore, with
probability 1,
A(t) = etX · ec = AetX .
This shows that if the interest rate is compounded continuously, then an initial investment
of A dollars will grow, in t years, with probability 1, to the random variable AetX , whose
expected value is
E(AetX ) = AE(etX ) = AMX (t).
We have shown the following:
If money is invested in a bank at an annual rate X, where X is a random
variable, and if the bank compounds interest continuously, then, on average, the money will grow by a factor of MX (t), the momentgenerating
function of the interest rate. 26. Since Xi and Xj are binomial with parameters (n, pi ) and (n, pj ),
E(Xi ) = npi ,
σXi = npi (1 − pi ), E(Xj ) = npj ,
σXj = npj (1 − pj ). Section 11.2 Sums of Independent Random Variables To ﬁnd E(Xi Xj ), note that
M(t1 , t2 ) = E et1 Xi +t2 Xj
n n−xi = et1 xi +t2 xj P (Xi = xi , Xj = xj ) xi =0 xj =0
n n−xi = et1 xi +t2 xj · xi =0 xj =0
n n!
x
pixi pj j (1 − pi − pj )n−xi −xj
xi ! xj ! (n − xi − xj )! n−xi =
xi =0 xj n!
et1 pi
xi ! xj ! (n − xi − xj )!
=0 xi et2 pj xj (1 − pi − pj )n−xi −xj n = pi et1 + pj et2 + 1 − pi − pj ,
where the last equality follows from multinomial expansion (Theorem 2.6). Therefore,
∂ 2M
(t1 , t2 ) = n(n − 1)pi pj et1 et2 pi et1 + pj et2 + 1 − pi − pj
∂t1 ∂t2 n−2 , and so
E(Xi Xj ) =
Thus 11.2 ∂ 2M
(0, 0) = n(n − 1)pi pj .
∂t1 ∂t2 pi pj
n(n − 1)pi pj − (npi )(npj )
ρ(Xi , Xj ) = √
.
=−
(1 − pi )(1 − pj )
npi (1 − pi ) · npj (1 − pj ) SUMS OF INDEPENDENT RANDOM VARIABLES 1. MαX (t) = E etαX = MX (tα) = exp α µt + (1/2)α 2 σ 2 t 2 .
2. Since
MX1 +X2 +···+Xn (t) = MX1 (t)MX2 (t) · · · MXn (t) = p et
1 − (1 − p)et X1 + X2 + · · · + Xn is negative binomial with parameters (n, p). 3. Since
MX1 +X2 +···+Xn (t) = MX1 (t)MX2 (t) · · · MXn (t) =
X1 + X2 + · · · + Xn is gamma with parameters n and λ. λ
λ−t n , n , 269 270 Chapter 11 Sums of Independent Random Variables and Limit Theorems 4. For 1 ≤ i ≤ n, let Xi be negative binomial with parameters ri and p. We have that
M X1 +X2 +···+Xn (t) = MX1 (t)MX2 (t) · · · MXn (t)
= p et
1 − (1 − p)et r1 = p et
1 − (1 − p)et r1 +r2 +···+rn p et
1 − (1 − p)et r2 ··· p et
1 − (1 − p)et rn . Thus X1 + X2 + · · · + Xr is negative binomial with parameters r1 + r2 + · · · + rn and p . 5. Since
MX1 +X2 +···+Xn (t) = MX1 (t)MX2 (t) · · · MXn (t)
λ r1
λ
λ r2
=
···
λ−t
λ−t
λ−t
λ r1 +r2 +···+rn
=
,
λ−t rn X1 + X2 + · · · + Xn is gamma with parameters r1 + r2 + · · · + rn and λ. 6. By Theorem 11.4, the total number of underﬁlled bottles is binomial with parameters 180 and
0.15. Therefore, the desired probability is
180
(0.15)27 (0.85)153 = 0.083.
27 7. For j < i , P (X = i  X + Y = j ) = 0. For j ≥ i ,
P (X = i  X + Y = j ) = = P (X = i)P (Y = j − i)
P (X = i, Y = j − i)
=
P (X + Y = j )
P (X + Y = j ) ni
m
p (1 − p)n−i ·
p j −i (1 − p)m−(j −i)
i
j −i
=
n+m j
n+m−j
p (1 − p)
j n
i m
j −i
.
n+m
j Interpretation: Given that in n + m trials exactly j successes have occurred, the probability
mass function of the number of successes in the ﬁrst n trials is hypergeometric. This should
be intuitively clear. Section 11.2 Sums of Independent Random Variables 271 8. Since X + Y + Z is Poisson with parameter λ1 + λ2 + λ3 and X + Z is Poisson with parameter
λ1 + λ3 , we have that P (Y = y  X + Y + Z = t) = P (Y = y, X + Z = t − y)
P (X + Y + Z = t)
e−λ2 λ2 e−(λ1 +λ3 ) (λ1 + λ3 )t −y
·
y!
(t − y)!
y = e−(λ1 +λ2 +λ3 ) (λ1 + λ2 + λ3 )t
t!
t
y = λ2
λ1 + λ2 + λ3 y λ1 + λ3
λ1 + λ2 + λ3 t −y . 9. Let X be the remaining calling time of the person in the booth. Let Y be the calling time of the
person ahead of Mr. Watkins. By the memoryless property of exponential, X is exponential
with parameter 1/8. Since Y is also exponential with parameter 1/8, assuming that X and Y
are independent, the waiting time of Mr. Watkins, X + Y , is gamma with parameters 2 and
1/8. Therefore,
∞ P (X + Y ≥ 12) = 12 5
1 −x/8
xe
dx = e−3/2 = 0.558.
64
2 10. By Theorem 11.7, X + Y ∼ N(5, 9), X − Y ∼ N (−3, 9), and 3X + 4Y ∼ N (19, 130). Thus
P (X + Y > 0) = P 0−5
X+Y −5
>
=1−
3
3 P (X − Y < 2) = P (−1.67) = 2+3
X−Y +3
<
=
3
3 (1.67) = 0.9525, (1.67) = 0.9525, and
P (3X + 4Y > 20) = P 20 − 19
3X + 4Y − 19
>√
=1−
√
130
130 (0.9) = 0.4641. ¯
¯
11. Theorem 11.7 implies that X ∼ N(110, 1.6), where X is the average of the IQ’s of the
randomly selected students. Therefore,
¯
112 − 110
X − 110
¯
≥√
P ( X ≥ 112) = P
√
1.6
1.6 =1− (1.58) = 0.0571. ¯
¯
12. Let X1 be the average of the accounts selected at store 1 and X2 be the average of the accounts
selected at store 2. We have that
900
¯
= N(90, 90)
X1 ∼ N 90,
10 and 500
2500
¯
X2 ∼ N 100,
= N 100,
.
15
3 272 Chapter 11 Sums of Independent Random Variables and Limit Theorems ¯
¯
Therefore, X1 − X2 ∼ N − 10, 770
and so
3 ¯
¯
X1 − X2 + 10
0 + 10
¯
¯
¯
¯
P ( X1 > X2 ) = P ( X1 − X2 > 0) = P
>√
√
770/3
770/3
= 1 − (0.62) = 0.2676. 13. By Exercise 6, Section 10.5, X and Y are sums of independent standard normal random
variables. Hence αX + βY is a linear combination of independent standard normal random
variables. Thus, by Theorem 11.7, αX + βY is normal. 14. By Exercise 13, X − Y is normal; its mean is 71 − 60 = 11, its variance is
Var(X − Y ) = Var(X) + Var(Y ) − 2Cov(X, Y )
= Var(X) + Var(Y ) − 2ρ(X, Y )σX σY
= 9 + (2.7)2 − 2(0.45)(3)(2.7) = 9.
Therefore,
P (X − Y ≥ 8) = P 8 − 11
X − Y − 11
≥
=1−
3
3 (−1) = (1) = 0.8413. ¯
15. Let X be the average of the weights of the 12 randomly selected athletes. Let X1 , X2 , . . . ,
X12 be the weights of these athletes. Since
2 25
¯
X ∼ N 225,
12 = N 225, 625
,
12 we have that
2700
¯
¯
= P ( X ≤ 225)
P (X1 + X2 + · · · + X12 ≤ 2700) = P X ≤
12
¯
225 − 225
X − 225
≤√
=
=P √
625/12
625/12 (0) = 1
.
2 ¯
¯
16. Let X1 and X2 be the averages of the ﬁnal grades of the probability and calculus courses
Dr. Olwell teaches, respectively. We have that
418
448
¯
¯
X1 ∼ N 65,
= N (65, 19) and X2 ∼ N 72,
= N(72, 16).
22
28
¯
¯
Therefore, X1 − X2 ∼ N(−7, 35) and hence the desired probability is
¯
¯
¯
¯
¯
¯
P X1 − X2  ≥ 2 = P ( X1 − X2 ≥ 2) + P ( X1 − X2 ≤ −2)
=P
=1− ¯
¯
2+7
X1 − X2 + 7
≥√
√
35
35
(1.52) + +P ¯
¯
−2 + 7
X1 − X2 + 7
≤√
√
35
35 (0.85) = 1 − 0.9352 + 0.8023 = 0.8671. Section 11.2 Sums of Independent Random Variables 273 17. Let X and Y be the lifetimes of the mufﬂers of the ﬁrst and second cars, respectively.
(a) To calculate the desired probability, P (X − Y  ≥ 1.5), note that by symmetry,
P X − Y  ≥ 1.5 = 2P (X − Y ≥ 1.5).
Now X − Y ∼ N(0, 2), hence
P X − Y  ≥ 1.5 = 2P 1.5 − 0
X−Y −0
≥√
√
2
2 =2 1− (1.06) = 0.289. (b) Let Z be the lifetime of the ﬁrst mufﬂer the family buys. By symmetry, the desired
probability is
2P (Y > X + Z) = 2P (Y − X − Z > 0).
Now Y − X − Z ∼ N(−3, 3). Hence
2P (Y − X − Z > 0) = 2P Y −X−Z+3
0+3
>√
√
3
3 =2 1− (1.73) = 0.0836. 18. Let n be the maximum number of passengers who can use the elevator and X1 , X2 , . . . , Xn
be the weights of n random passengers. We must have
P (X1 + X2 + · · · Xn > 3000) < 0.0003
or, equivalently,
P (X1 + X2 + · · · + Xn ≤ 3000) > 0.9997.
¯
Let X be the mean of the weights of the n random passengers. We must have
3000
¯
P X≤
n
¯
Since X ∼ N 155, 625
, we must have
n
P or > 0.9997. ¯
(3000/n) − 155
X − 155
> 0.9997,
√≤
√
25/ n
25/ n
√
3000
155 n
> 0.9997.
√−
25
25 n Using Table 2 of the Appendix, this gives
√
3000
155 n
≥ 3.49
√−
25
25 n
or, equivalently, √
155n + 87.25 n − 3000 ≤ 0. 274 Chapter 11 Sums of Independent Random Variables and Limit Theorems √
Since the roots of the quadratic equation 155n + 87.25 n − 3000 = 0 are (approximately)
√
√
n = 4.127 and n = −4.69, the inequality is valid if and only if
√
√
n + 4.69
n − 4.127 ≤ 0.
√
√
But n + 4.69 > 0, so the inequality is valid if and only if n − 4.127 ≤ 0 or n ≤ 17.032.
Therefore the answer is n = 17. 19. By Remark 9.3, the marginal joint probability mass function of X1 , X2 , . . . , Xk is multinomial
with parameters n and (p1, p2, . . . , pk, 1 − p1 − p2 − · · · − pk). Thus, letting p = p1 + p2 + · · · + pk and x = x1 + x2 + · · · + xk, we have that

  p(x1, x2, . . . , xk) = [ n! / (x1! x2! · · · xk! (n − x)!) ] p1^x1 p2^x2 · · · pk^xk (1 − p)^(n−x).

This gives

  P(X1 + X2 + · · · + Xk = i)
  = Σ_{x1+x2+···+xk = i} [ n! / (x1! x2! · · · xk! (n − i)!) ] p1^x1 p2^x2 · · · pk^xk (1 − p)^(n−i)
  = [ n! / (i! (n − i)!) ] (1 − p)^(n−i) Σ_{x1+x2+···+xk = i} [ i! / (x1! x2! · · · xk!) ] p1^x1 p2^x2 · · · pk^xk
  = C(n, i) (1 − p)^(n−i) (p1 + p2 + · · · + pk)^i
  = C(n, i) p^i (1 − p)^(n−i).

This shows that X1 + X2 + · · · + Xk is binomial with parameters n and p = p1 + p2 + · · · + pk.

20. First note that if Y1 and Y2 are two exponential random variables each with rate λ, min(Y1, Y2)
is exponential with rate 2λ. Now let A1, A2, . . . , A11 be the customers in the line ahead of Kim. By the memoryless property of exponential random variables, X1, the time until A1's turn to make a call, is exponential with rate 2(1/3) = 2/3. The time until A2's turn to call is X1 + X2, where X2 is exponential with rate 2(1/3) = 2/3. Continuing this argument, and considering the fact that Kim is the 12th person waiting in line, the time until Kim's turn to make a phone call is X1 + X2 + · · · + X12, where {X1, X2, . . . , X12} is an independent and identically distributed sequence of exponential random variables, each with rate 2/3. Hence the distribution of Kim's waiting time is gamma with parameters (12, 2/3). Her expected waiting time is 12/(2/3) = 18.

11.3 MARKOV AND CHEBYSHEV INEQUALITIES

1. Let X be the lifetime (in months) of a randomly selected dollar bill. We are given that
E(X) = 22. By Markov's inequality,

  P(X ≥ 60) ≤ 22/60 ≈ 0.37.

This shows that at most 37% of the one-dollar bills last 60 or more months, that is, at least five years.

2. We have that P(X ≥ 2) = 2/5. Hence, by Markov's inequality,

  2/5 = P(X ≥ 2) ≤ E(X)/2.

This gives E(X) ≥ 4/5.

3. (a) P(X ≥ 11) ≤ E(X)/11 = 5/11 ≈ 0.4545.

(b) P(X ≥ 11) = P(X − 5 ≥ 6) ≤ P(|X − 5| ≥ 6) ≤ σ²/36 = (42 − 25)/36 ≈ 0.472.

4. Let X be the lifetime of the randomly selected light bulb; we have

  P(X ≤ 700) ≤ P(|X − 800| ≥ 100) ≤ 2500/10,000 = 0.25.

5. Let X be the number of accidents that will occur tomorrow. Then

(a) P(X ≥ 5) ≤ 2/5 = 0.4.

(b) P(X ≥ 5) = 1 − Σ_{i=0}^{4} e^(−2) 2^i / i! = 0.053.

(c) P(X ≥ 5) = P(X − 2 ≥ 3) ≤ P(|X − 2| ≥ 3) ≤ 2/9 ≈ 0.222.

6. Let X be the IQ of a randomly selected student from this campus; we have

  P(X > 140) ≤ P(|X − 110| > 30) ≤ 15/900 ≈ 0.017.

Therefore, less than 1.7% of these students have an IQ above 140.

7. Let X be the waiting period from the time Helen orders the book until she receives it. We want to find a so that P(X < a) ≥ 0.95 or, equivalently, P(X ≥ a) ≤ 0.05. But

  P(X ≥ a) = P(X − 7 ≥ a − 7) ≤ P(|X − 7| ≥ a − 7) ≤ 4/(a − 7)².

So we should determine the value of a for which 4/(a − 7)² ≤ 0.05; it is easily seen that a ≥ 15.9, so a = 16. Therefore, Helen should order the book 16 days earlier.

8. By Markov's inequality,

  P(X ≥ 2µ) ≤ µ/(2µ) = 1/2.

9. P(X > 2µ) = P(X − µ > µ) ≤ P(|X − µ| ≥ µ) ≤ µ/µ² = 1/µ.
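The bounds in problems 5, 8, and 9 can be checked numerically. The sketch below compares the exact Poisson(2) tail of problem 5 with its Markov and Chebyshev bounds; `poisson_tail` is a small helper defined here for the check, not a library routine:

```python
import math

def poisson_tail(mu, k):
    """P(X >= k) for X ~ Poisson(mu), computed as 1 minus the cdf at k - 1."""
    return 1.0 - sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k))

# Problem 5: X ~ Poisson(2), so E(X) = Var(X) = 2.
exact  = poisson_tail(2, 5)   # about 0.053, as in part (b)
markov = 2 / 5                # part (a):  P(X >= 5) <= E(X)/5
cheby  = 2 / 9                # part (c):  P(|X - 2| >= 3) <= Var(X)/9
print(exact, markov, cheby)
```

Both bounds indeed dominate the exact tail, and here Chebyshev's bound is the tighter of the two.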
10. We have that
  P(38 < X̄ < 46) = P(−4 < X̄ − 42 < 4) = P(|X̄ − 42| < 4) = 1 − P(|X̄ − 42| ≥ 4).

By (11.3),

  P(|X̄ − 42| ≥ 4) ≤ 60/(16 · 25) = 3/20.

Hence

  P(38 < X̄ < 46) ≥ 1 − 3/20 = 17/20 = 0.85.

11. For i = 1, 2, . . . , n, let Xi be the IQ of the ith student selected at random. We want to find n so that

  P( −3 < (X1 + X2 + · · · + Xn)/n − µ < 3 ) ≥ 0.92

or, equivalently, P(|X̄ − µ| ≥ 3) ≤ 0.08. Since E(Xi) = µ and Var(Xi) = 150, by (11.3),

  P(|X̄ − µ| ≥ 3) ≤ 150/(3² · n).

Therefore, all we need to do is to find n for which 150/(9n) ≤ 0.08. This gives n ≥ 150/[9(0.08)] = 208.33. Thus the psychologist should choose a sample of size 209.

12. Let X1, X2, . . . , Xn be the random sample, µ be the expected value of the distribution, and σ² be the variance of the distribution. We want to find n so that

  P(|X̄ − µ| < 2σ) ≥ 0.98

or, equivalently, P(|X̄ − µ| ≥ 2σ) < 0.02. By (11.3),

  P(|X̄ − µ| ≥ 2σ) ≤ σ²/((2σ)² · n) = 1/(4n).

Therefore, all we need to do is to make sure that 1/(4n) ≤ 0.02. This gives n ≥ 12.5. So a sample of size 13 gives a mean which is within 2 standard deviations of the expected value with a probability of at least 0.98.

13. Call a random observation a success if the operator is busy, and a failure if he is free. In (11.5), let ε = 0.05 and α = 0.04; we have

  n ≥ 1/[4(0.05)²(0.04)] = 2500.

Therefore, at least 2500 independent observations should be made to ensure that (1/n) Σ_{i=1}^{n} Xi estimates p, the proportion of time that the airline operator is busy, with a maximum error of 0.05 with probability 0.96 or higher.

14. By (11.5),

  n ≥ 1/[4(0.05)²(0.06)] = 1666.67.

Therefore, it suffices to flip the coin n = 1667 times independently.
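Problems 13 and 14 both instantiate the same rule n ≥ 1/(4ε²α) from (11.5). A small helper (the name `chebyshev_sample_size` is mine; exact rational arithmetic avoids float round-off exactly at the boundary) reproduces both answers:

```python
import math
from fractions import Fraction

def chebyshev_sample_size(eps, alpha):
    """Smallest integer n with 1/(4*n*eps^2) <= alpha, i.e. n >= 1/(4*eps^2*alpha)."""
    bound = 1 / (4 * Fraction(str(eps)) ** 2 * Fraction(str(alpha)))
    return math.ceil(bound)

print(chebyshev_sample_size(0.05, 0.04))  # 2500 observations (problem 13)
print(chebyshev_sample_size(0.05, 0.06))  # 1667 coin flips   (problem 14)
```

Passing `eps` and `alpha` through `Fraction(str(...))` keeps 1/(4ε²α) exact, so the ceiling at an integer boundary (as in problem 13) comes out right.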
15. P(|X − µ| ≥ α) = P(|X − µ|^(2n) ≥ α^(2n)) ≤ E[(X − µ)^(2n)] / α^(2n).

16. By Markov's inequality,

  P(X > t) = P(e^(kX) > e^(kt)) ≤ E(e^(kX)) / e^(kt).

17. By the corollary of the Cauchy–Schwarz inequality (Theorem 10.3),

  [E(X − Y)]² ≤ E[(X − Y)²] = 0.

This gives E(X − Y) = 0. Therefore,

  Var(X − Y) = E[(X − Y)²] − [E(X − Y)]² = 0.

We have shown that X − Y is a random variable with mean 0 and variance 0; by Example 11.16, P(X − Y = 0) = 1. So with probability 1, X = Y.

18. If Y = X with probability 1, Theorem 10.5 implies that ρ(X, Y) = 1. Suppose that ρ(X, Y) = 1; we show that X = Y with probability 1. Note that E(X) = E(Y) = (n + 1)/2, Var(X) = Var(Y) = (n² − 1)/12, and σX = σY = √[(n² − 1)/12]. These and

  1 = ρ(X, Y) = [E(XY) − E(X)E(Y)] / (σX σY)

imply that E(XY) = (2n² + 3n + 1)/6. Therefore,

  E[(X − Y)²] = E(X² − 2XY + Y²) = E(X²) + E(Y²) − 2E(XY)
  = Var(X) + [E(X)]² + Var(Y) + [E(Y)]² − 2E(XY)
  = [ (n² − 1)/12 + ((n + 1)/2)² ] + [ (n² − 1)/12 + ((n + 1)/2)² ] − (2n² + 3n + 1)/3 = 0.

E[(X − Y)²] = 0 implies that, with probability 1, X = Y (see Exercise 17 above).

19. By Markov's inequality,
  P( X ≥ (1/t) ln α ) = P(tX ≥ ln α) = P(e^(tX) ≥ α) ≤ E(e^(tX))/α = MX(t)/α.

20. Using the gamma function introduced in Section 7.4,

  E(X) = (1/n!) ∫₀^∞ x^(n+1) e^(−x) dx = Γ(n + 2)/n! = (n + 1)!/n! = n + 1,

  E(X²) = (1/n!) ∫₀^∞ x^(n+2) e^(−x) dx = Γ(n + 3)/n! = (n + 2)!/n! = (n + 1)(n + 2).

Hence

  σX² = (n + 1)(n + 2) − (n + 1)² = n + 1.

Now P(0 < X < 2n + 2) = 1 − P(X ≥ 2n + 2), and by Chebyshev's inequality,

  P(X ≥ 2n + 2) = P(X − (n + 1) ≥ n + 1) ≤ P(|X − (n + 1)| ≥ n + 1) ≤ (n + 1)/(n + 1)² = 1/(n + 1).

Therefore,

  P(0 < X < 2(n + 1)) ≥ 1 − 1/(n + 1) = n/(n + 1).

11.4 LAWS OF LARGE NUMBERS

1. Since

  E(Xi) = ∫₀¹ x · 4x(1 − x) dx = 1/3,

by the strong law of large numbers,

  P( lim_{n→∞} (X1 + X2 + · · · + Xn)/n = 1/3 ) = 1.

2. If X1 > M with probability 1, then X2 > M with probability 1, since X1 and X2 are identically distributed. Therefore, X1 + X2 > 2M > M with probability 1. This argument shows that

  {X1 > M} ⊆ {X1 + X2 > M} ⊆ {X1 + X2 + X3 > M} ⊆ · · · .

Therefore, by the continuity of the probability function (Theorem 1.8),

  lim_{n→∞} P(X1 + X2 + · · · + Xn > M) = P( lim_{n→∞} X1 + X2 + · · · + Xn > M ).

By this relation, it suffices to show that ∀M > 0,
  lim_{n→∞} X1 + X2 + · · · + Xn > M   (45)

with probability 1. Let S be the sample space over which the Xi's are defined. Let µ = E(Xi); we are given that µ > 0. By the strong law of large numbers,

  P( lim_{n→∞} (X1 + X2 + · · · + Xn)/n = µ ) = 1.

Therefore, letting

  V = { ω ∈ S : lim_{n→∞} [X1(ω) + X2(ω) + · · · + Xn(ω)]/n = µ },

we have that P(V) = 1. To establish (45), it is sufficient to show that ∀ω ∈ V,

  lim_{n→∞} X1(ω) + X2(ω) + · · · + Xn(ω) = ∞.   (46)

To do so, applying the definition of limit to

  lim_{n→∞} [X1(ω) + X2(ω) + · · · + Xn(ω)]/n = µ,

we have that for ε = µ/2 there exists a positive integer N (depending on ω) such that ∀n > N,

  | [X1(ω) + X2(ω) + · · · + Xn(ω)]/n − µ | < ε = µ/2

or, equivalently,

  −µ/2 < [X1(ω) + X2(ω) + · · · + Xn(ω)]/n − µ < µ/2.

This yields

  [X1(ω) + X2(ω) + · · · + Xn(ω)]/n > µ/2.

Thus, for all n > N,

  X1(ω) + X2(ω) + · · · + Xn(ω) > nµ/2,

which establishes (46).

3. For 0 < ε < 1,
  P(|Yn − 0| > ε) = 1 − P(|Yn − 0| ≤ ε) = 1 − P(X ≤ n) = 1 − ∫₀ⁿ f(x) dx.

Therefore,

  lim_{n→∞} P(|Yn − 0| > ε) = 1 − ∫₀^∞ f(x) dx = 1 − 1 = 0,

showing that Yn converges to 0 in probability.

4. By the strong law of large numbers, Sn/n converges to µ almost surely. Therefore, Sn/n converges to µ in probability, and hence

  lim_{n→∞} P( n(µ − ε) ≤ Sn ≤ n(µ + ε) ) = lim_{n→∞} P( µ − ε ≤ Sn/n ≤ µ + ε )
  = lim_{n→∞} P( |Sn/n − µ| ≤ ε ) = 1 − lim_{n→∞} P( |Sn/n − µ| > ε ) = 1 − 0 = 1.

5. Suppose that the bank will never be empty of customers again. We will show a contradiction.
Let Un = T1 + T2 + · · · + Tn. Then Un is the time the nth new customer arrives. Let Wi be the service time of the ith new customer served. Clearly, W1, W2, W3, . . . are independent and identically distributed random variables with E(Wi) = 1/µ. Let Zn = T1 + W1 + W2 + · · · + Wn. Since the bank will never be empty of customers, Zn is the departure time of the nth new customer served. By the strong law of large numbers,

  lim_{n→∞} Un/n = 1/λ

and

  lim_{n→∞} Zn/n = lim_{n→∞} [ T1/n + (W1 + W2 + · · · + Wn)/n ]
  = lim_{n→∞} T1/n + lim_{n→∞} (W1 + W2 + · · · + Wn)/n = 0 + 1/µ = 1/µ.

Clearly, the bank will never remain empty of customers again if and only if, ∀n,

  Un+1 < Zn.

This implies that

  Un+1/n < Zn/n

or, equivalently,

  [(n + 1)/n] · [Un+1/(n + 1)] < Zn/n.

Thus

  lim_{n→∞} [(n + 1)/n] · [Un+1/(n + 1)] ≤ lim_{n→∞} Zn/n.   (47)

Since lim_{n→∞} (n + 1)/n = 1, and with probability 1, lim_{n→∞} Un+1/(n + 1) = 1/λ and lim_{n→∞} Zn/n = 1/µ, (47) implies that 1/λ ≤ 1/µ, or λ ≥ µ. This is a contradiction to the fact that λ < µ. Hence, with probability 1, eventually, for some period, the bank will be empty of customers again.

6. Suppose that the bank will never be empty of customers again. We will show a contradiction.
Let Un = T1 + T2 + · · · + Tn. Then Un is the time the nth new customer arrives. Let R be the sum of the remaining service time of the customer being served and the sums of the service times of the m customers present in the queue at t = 0. Let Zn = R + S1 + S2 + · · · + Sn. Since the bank will never be empty of customers, and customers are served on a first-come, first-served basis, we have that U1 < R, and hence Zn is the departure time of the nth new customer. By the strong law of large numbers,

  lim_{n→∞} Un/n = 1/λ

and

  lim_{n→∞} Zn/n = lim_{n→∞} [ R/n + (S1 + S2 + · · · + Sn)/n ]
  = lim_{n→∞} R/n + lim_{n→∞} (S1 + S2 + · · · + Sn)/n = 0 + 1/µ = 1/µ.

Clearly, the bank will never remain empty of customers if and only if, ∀n,

  Un+1 < Zn.

This implies that

  Un+1/n < Zn/n

or, equivalently,

  [(n + 1)/n] · [Un+1/(n + 1)] < Zn/n.

Thus

  lim_{n→∞} [(n + 1)/n] · [Un+1/(n + 1)] ≤ lim_{n→∞} Zn/n.   (48)

Since lim_{n→∞} (n + 1)/n = 1, and with probability 1, lim_{n→∞} Un+1/(n + 1) = 1/λ and lim_{n→∞} Zn/n = 1/µ, (48) implies that 1/λ ≤ 1/µ, or λ ≥ µ. This is a contradiction to the fact that λ < µ. Hence, with probability 1, eventually, for some period, the bank will be empty of customers.

7. Xn converges to 0 in probability because, for every ε > 0, P(|Xn − 0| ≥ ε) is the probability
that the random point selected from [0, 1] is in [i/2^k, (i + 1)/2^k]. Now n → ∞ implies that 2^k → ∞, and the length of the interval [i/2^k, (i + 1)/2^k] tends to 0. Therefore, lim_{n→∞} P(|Xn − 0| ≥ ε) = 0. However, Xn does not converge at any point, because for every positive integer N there are always m > N and n > N such that Xm = 0 and Xn = 1, making it impossible for |Xn − Xm| to become small.

Section 12.4 Continuous-Time Markov Chains

16. It is straightforward to see that
  π(i,j) = (λ/µ1)^i (1 − λ/µ1) (λ/µ2)^j (1 − λ/µ2),  i, j ≥ 0,

satisfy the following balance equations for the tandem queueing system under consideration. Hence, by Example 12.43, π(i,j) is the product of the probability that an M/M/1 system has i customers in the system and the probability that another M/M/1 queueing system has j customers in the system. This establishes what we wanted to show.

  State             Input rate to = Output rate from
  (0, 0)            µ2 π(0,1) = λπ(0,0)
  (i, 0), i ≥ 1     µ2 π(i,1) + λπ(i−1,0) = λπ(i,0) + µ1 π(i,0)
  (0, j), j ≥ 1     µ2 π(0,j+1) + µ1 π(1,j−1) = λπ(0,j) + µ2 π(0,j)
  (i, j), i, j ≥ 1  µ2 π(i,j+1) + µ1 π(i+1,j−1) + λπ(i−1,j) = λπ(i,j) + µ1 π(i,j) + µ2 π(i,j)

17. Clearly, {X(t) : t ≥ 0} is a birth and death process with birth rates λi = iλ, i ≥ 0, and death rates µi = iµ + γ, i > 0; µ0 = 0. For some m ≥ 1, suppose that X(t) = m. Then, for
infinitesimal values of h, by (12.5), the population at t + h is m + 1 with probability mλh + o(h), it is m − 1 with probability (mµ + γ)h + o(h), and it is still m with probability

  1 − mλh − o(h) − (mµ + γ)h − o(h) = 1 − (mλ + mµ + γ)h + o(h).

Therefore,

  E[X(t + h) | X(t) = m] = (m + 1)[mλh + o(h)] + (m − 1)[(mµ + γ)h + o(h)] + m[1 − (mλ + mµ + γ)h + o(h)]
  = m + [m(λ − µ) − γ]h + o(h).

This relation implies that

  E[X(t + h) | X(t)] = X(t) + [(λ − µ)X(t) − γ]h + o(h).

Equating the expected values of both sides, and noting that E[E[X(t + h) | X(t)]] = E[X(t + h)], we obtain

  E[X(t + h)] = E[X(t)] + h(λ − µ)E[X(t)] − γh + o(h).

For simplicity, let g(t) = E[X(t)]. We have shown that

  g(t + h) = g(t) + h(λ − µ)g(t) − γh + o(h)

or, equivalently,

  [g(t + h) − g(t)]/h = (λ − µ)g(t) − γ + o(h)/h.

As h → 0, this gives

  g′(t) = (λ − µ)g(t) − γ.

If λ = µ, then g′(t) = −γ. So g(t) = −γt + c. Since g(0) = n, we must have c = n, or g(t) = −γt + n. If λ ≠ µ, to solve the first-order linear differential equation

  g′(t) = (λ − µ)g(t) − γ,

let f(t) = (λ − µ)g(t) − γ. Then

  [1/(λ − µ)] f′(t) = f(t),  or  f′(t)/f(t) = λ − µ.

This yields

  ln f(t) = (λ − µ)t + c,

or

  f(t) = e^((λ−µ)t + c) = Ke^((λ−µ)t),

where K = e^c. Thus

  g(t) = [K/(λ − µ)] e^((λ−µ)t) + γ/(λ − µ).

Now g(0) = n implies that K = n(λ − µ) − γ. Thus

  g(t) = E[X(t)] = ne^((λ−µ)t) + [γ/(λ − µ)][1 − e^((λ−µ)t)].

18. For n ≥ 0, let En be the event that, starting from state n, eventually extinction will occur. Let
αn = P(En). Clearly, α0 = 1. We will show that αn = 1 for all n. For n ≥ 1, starting from n, let Zn be the state to which the process will move. Then Zn is a discrete random variable with the set of possible values {n − 1, n + 1}. Conditioning on Zn yields

  P(En) = P(En | Zn = n − 1)P(Zn = n − 1) + P(En | Zn = n + 1)P(Zn = n + 1).

Hence

  αn = αn−1 · µn/(λn + µn) + αn+1 · λn/(λn + µn),  n ≥ 1,

or, equivalently,

  λn(αn+1 − αn) = µn(αn − αn−1),  n ≥ 1.

For n ≥ 0, let yn = αn+1 − αn. We have

  λn yn = µn yn−1,  n ≥ 1,

or

  yn = (µn/λn) yn−1,  n ≥ 1.

Therefore,

  y1 = (µ1/λ1) y0
  y2 = (µ2/λ2) y1 = [(µ1µ2)/(λ1λ2)] y0
  .
  .
  .
  yn = [(µ1µ2 · · · µn)/(λ1λ2 · · · λn)] y0,  n ≥ 1.

On the other hand, by yn = αn+1 − αn, n ≥ 0,

  α1 = α0 + y0 = 1 + y0
  α2 = α1 + y1 = 1 + y0 + y1
  .
  .
  .
  αn+1 = 1 + y0 + y1 + · · · + yn.

Hence

  αn+1 = 1 + y0 + Σ_{k=1}^{n} yk
  = 1 + y0 + y0 Σ_{k=1}^{n} (µ1µ2 · · · µk)/(λ1λ2 · · · λk)
  = 1 + y0 [ 1 + Σ_{k=1}^{n} (µ1µ2 · · · µk)/(λ1λ2 · · · λk) ]
  = 1 + (α1 − 1) [ 1 + Σ_{k=1}^{n} (µ1µ2 · · · µk)/(λ1λ2 · · · λk) ].

Since Σ_{k=1}^{∞} (µ1µ2 · · · µk)/(λ1λ2 · · · λk) = ∞, the sequence Σ_{k=1}^{n} (µ1µ2 · · · µk)/(λ1λ2 · · · λk) increases without bound. For the αn's to exist, this requires that α1 = 1, which in turn implies that αn+1 = 1 for n ≥ 1.

12.5 BROWNIAN MOTION

1. (a) By the independent-increments property of Brownian motion, the desired probability
is

  P(−1/2 < Z(10) < 1/2 | Z(5) = 0)
  = P(−1/2 < Z(10) − Z(5) < 1/2 | Z(5) = 0)
  = P(−1/2 < Z(10) − Z(5) < 1/2).

Since Z(10) − Z(5) is normal with mean 0 and variance (10 − 5)σ² = 45, letting Z ∼ N(0, 1), we have

  P(−1/2 < Z(10) − Z(5) < 1/2) = P( (−0.5 − 0)/√45 < Z < (0.5 − 0)/√45 )
  ≈ P(−0.07 < Z < 0.07) = Φ(0.07) − Φ(−0.07) = 0.056.

(b) In Theorem 12.9, let t1 = 5, t2 = 7, z1 = 0, z2 = −1. We have

  E[Z(6) | Z(5) = 0 and Z(7) = −1] = 0 + [(−1 − 0)/(7 − 5)](6 − 5) = −0.5,

  Var[Z(6) | Z(5) = 0 and Z(7) = −1] = 9 · (7 − 6)(6 − 5)/(7 − 5) = 4.5.

2. In the subsection of 12.5, The Maximum of a Brownian Motion, we have shown that
  P( max_{0≤s≤t} X(s) ≤ u ) = 2Φ(u/(σ√t)) − 1 if u ≥ 0, and 0 if u < 0.

We will show that |X(t)| has the same probability distribution function. To do so, note that X(t) ∼ N(0, σ²t) and X(t)/(σ√t) is standard normal. Thus, for u ≥ 0,

  P(|X(t)| ≤ u) = P(−u ≤ X(t) ≤ u) = P(X(t) ≤ u) − P(X(t) < −u)
  = P( Z ≤ u/(σ√t) ) − P( Z < −u/(σ√t) )
  = Φ(u/(σ√t)) − [1 − Φ(u/(σ√t))] = 2Φ(u/(σ√t)) − 1.

For u < 0, P(|X(t)| ≤ u) = 0. Hence max_{0≤s≤t} X(s) and |X(t)| are identically distributed.

3. Let Z ∼ N(0, 1). Since X(t) ∼ N(0, σ²t), we have
  P( |X(t)|/t > ε ) = P(|X(t)| > εt)
  = P(X(t) > εt) + P(X(t) < −εt)
  = P( Z > εt/(σ√t) ) + P( Z < −εt/(σ√t) )
  = P( Z > ε√t/σ ) + P( Z < −ε√t/σ )
  = [1 − Φ(ε√t/σ)] + Φ(−ε√t/σ)
  = [1 − Φ(ε√t/σ)] + [1 − Φ(ε√t/σ)] = 2 − 2Φ(ε√t/σ).

This implies that

  lim_{t→0} P( |X(t)|/t > ε ) = 2 − 2Φ(0) = 2 − 1 = 1,

whereas

  lim_{t→∞} P( |X(t)|/t > ε ) = 2 − 2 = 0.

4. Let F be the probability distribution function of 1/Y². Let Z ∼ N(0, 1). We have
α
α
=P Z ≥ √ +P Z ≤− √
σt
σt
α
α
α
− √ =2 1−
=1−
√+
√,
σt
σt
σt which, by (12.35), is also the distribution function of Tα . 5. Clearly, P (T < x) = 0 if x ≤ t . For x > t , by Theorem 12.10,
P (T < x) = P at least one zero in (t, x) = 2
arccos
π t
.
x Let F be the distribution function of T . We have shown that
⎧
⎪0
x≤t
⎨
F (x) = 2
⎪ arccos t
⎩
x ≥ t.
π
x 6. Rewrite X(t1 ) + X(t2 ) as X(t1 ) + X(t2 ) = 2X(t1 ) + X(t2 ) − X(t1 ). Now 2X(t1 ) and X(t2 ) −
X(t1 ) are independent random variables. By Theorem 11.7, 2X(t1 ) ∼ N(0, 4σ 2 t1 ). Since
X(t2 ) − X(t1 ) ∼ N 0, σ 2 (t2 − t1 ) , applying Theorem 11.7 once more implies that
  2X(t1) + [X(t2) − X(t1)] ∼ N(0, 4σ²t1 + σ²(t2 − t1)).

Hence X(t1) + X(t2) ∼ N(0, 3σ²t1 + σ²t2).

7. Let f(x, y) be the joint probability density function of X(t) and X(t + u). Let f_{X(t+u)|X(t)}(y | a) be the conditional probability density function of X(t + u) given that X(t) = a. Let f_{X(t)}(x)
be the probability density function of X(t). We know that X(t) is normal with mean 0 and
variance σ²t. The formula for f(x, y) is given by (12.28). Using these, we obtain

  f_{X(t+u)|X(t)}(y | a) = f(a, y) / f_{X(t)}(a)
  = [ (1/(2σ²π√(tu))) exp( −(1/(2σ²))( a²/t + (y − a)²/u ) ) ] / [ (1/(σ√(2πt))) exp( −a²/(2σ²t) ) ]
  = (1/(σ√(2πu))) exp( −(y − a)²/(2σ²u) ).

This shows that the conditional probability density function of X(t + u) given that X(t) = a is normal with mean a and variance σ²u. Hence

  E[X(t + u) | X(t) = a] = a.

This implies that

  E[X(t + u) | X(t)] = X(t).

8. By Example 10.23,
  E[X(t)X(t + u) | X(t)] = X(t) E[X(t + u) | X(t)].

By Exercise 7 above,

  E[X(t + u) | X(t)] = X(t).

Hence

  E[X(t)X(t + u)] = E[ E[X(t)X(t + u) | X(t)] ]
  = E[ X(t) E[X(t + u) | X(t)] ]
  = E[X(t) · X(t)] = E[X(t)²]
  = Var[X(t)] + (E[X(t)])² = σ²t + 0 = σ²t.

9. For t > 0, the probability density function of Z(t) is
φt (x) = 1
x2
.
exp −
√
2σ 2 t
σ 2π t Section 12.5 Brownian Motion 329 Therefore,
∞ E V (t) = E Z(t) =
∞ =2 −∞ x φt (x) dx 0 2t
π x
2
2
e−x /(2σ t) dx.
√
σ 2π t 0 Making the change of variable u = E V (t) = σ ∞ xφt (x) dx = 2 ∞ x
√ yields
σt
ue−u 2 /2 du = σ 0 Var V (t) = E V (t)2 − E V (t)
= σ 2t − 2t
2
− e−u /2
π 2 ∞
0 = E Z (t)2 − =σ 2t
.
π 2σ 2 t
π 2σ 2 t
2
= σ 2t 1 −
,
π
π since
E Z (t)2 = Var Z (t) + E Z (t) 2 = σ 2 t + 0 = σ 2 t. To ﬁnd P V (t) ≤ z  V (0) = z0 , note that, by (12.27),
P V (t) ≤ z  V (0) = z0 = P Z(t) ≤ z  V (0) = z0
= P − z ≤ Z(t) ≤ z  V (0) = z0
z = −z 1
2
2
e−(u−z0 ) /(2σ t) du.
√
σ 2π t Letting U ∼ N(z0 , σ 2 t) and Z ∼ N(0, 1), this implies that
P V (t) ≤ z  V (0) = z0 = P (−z ≤ U ≤ z)
=P
= z − z0
−
√
σt −z − z0
√
σt = 10. Clearly, D(t) = z − z0
−z − z0
√ ≤z≤ √
σt
σt z + z0
+
√
σt z − z0
− 1.
√
σt X (t)2 + Y (t)2 + Z(t)2 . Since X(t), Y (t), and Z(t) are independent and 330 Chapter 12 Stochastic Processes identically distributed normal random variables with mean 0 and variance σ 2 t , we have
E D (t) = ∞ ∞ ∞ −∞ −∞ −∞ 1
1
2
2
2
2
x 2 + y 2 + z2 · √
e−x /(2σ t) · √
e−y /(2σ t)
σ 2π t
σ 2π t
1
2
2
·√
e−z /(2σ t) dx dy dz
σ 2π t = 1
√
3 t 2π t
2π σ ∞ ∞ ∞ −∞ −∞ −∞ x 2 + y 2 + z2 · e−(x 2 +y 2 +z2 )/(2σ 2 t) dx dy dz. We now make a change of variables to spherical coordinates: x = ρ sin φ cos θ , y =
ρ sin φ sin θ , z = ρ cos φ , ρ 2 = x 2 + y 2 + z2 , dx dy dz = ρ 2 sin φ dρ dφ dθ, 0 ≤ ρ < ∞,
0 ≤ φ ≤ π , and 0 ≤ θ ≤ 2π . We obtain
E D (t) =
=
=
= 11. Noting that 1
√
2π σ 3 t 2π t
1
√
2π σ 3 t 2π t
1
√
3 t 2π t
2π σ 2π
0 ∞ π
0 ρe−ρ
∞ π
0 ρ 3 e−ρ 2 /(2σ 2 t) dρ sin φ dφ d θ 0 2π π 0 · ρ 2 sin φ dρ dφ, dθ 0 2π
0 2 /(2σ 2 t) − σ 2 t (ρ 2 + 2 2 t)e−ρ 2 /(2σ 2 t) 0 0 1
· 2σ 4 t 2
√
2π σ 3 t 2π t 2π
0 π sin φ dφ d θ = 2σ
0 ∞ sin φ dφ d θ 2t
.
π √
√5.29 = 2.3, we have

  V(t) = 95e^(−2t + 2.3W(t)),

where {W(t) : t ≥ 0} is a standard Brownian motion. Hence W(t) ∼ N(0, t). The desired probability is

  P(V(0.75) < 80) = P( 95e^(−2(0.75) + 2.3W(0.75)) < 80 )
  = P( e^(2.3W(0.75)) < 3.774 ) = P( W(0.75) < 0.577 )
  = P( (W(0.75) − 0)/√0.75 < 0.577/√0.75 ) = P(Z < 0.67) = Φ(0.67) = 0.7486.

REVIEW PROBLEMS FOR CHAPTER 12
1. Label the time point 10:00 as t = 0. We are given that N(180) = 10 and are interested in P(S10 ≥ 160 | N(180) = 10). Let X1, X2, . . . , X10 be 10 independent random variables uniformly distributed over the interval [0, 180]. Let Y = max(X1, . . . , X10). By Theorem 12.4,

  P(S10 > 160 | N(180) = 10) = P(Y > 160) = 1 − P(Y ≤ 160)
  = 1 − P( max(X1, . . . , X10) ≤ 160 )
  = 1 − P(X1 ≤ 160)P(X2 ≤ 160) · · · P(X10 ≤ 160)
  = 1 − (160/180)^10 = 0.692.

2. For every positive integer n, we have that

  P^(2n) = [ 1 0 ; 0 1 ]  and  P^(2n+1) = [ 0 1 ; 1 0 ].

Therefore, {Xn : n = 0, 1, . . . } is not regular.

3. By drawing a transition graph, it can be readily seen that, if states 0, 1, 2, 3, and 4 are renamed
0, 4, 2, 1, and 3, respectively, then the transition probability matrix P 1 will change to P 2 . 4. Let Z be the number of transitions until the ﬁrst visit to 1. Clearly, Z is a geometric random
variable with parameter p = 3/5. Hence its expected value is 1/p = 5/3. 5. By drawing a transition graph, it is readily seen that this Markov chain consists of two recurrent
classes {3, 5} and {4}, and two transient classes {1} and {2}. 6. We have that
  Xn+1 = Xn if the (n + 1)st outcome is not 6, and Xn+1 = 1 + Xn if the (n + 1)st outcome is 6.

This shows that {Xn : n = 1, 2, . . . } is a Markov chain with state space {0, 1, 2, . . . }. Its transition probability matrix is given by

  P = [ 5/6  1/6   0    0    0   . . .
         0   5/6  1/6   0    0   . . .
         0    0   5/6  1/6   0   . . .
         0    0    0   5/6  1/6  . . .
         .    .    .    .    .        ]

All states are transient; no two states communicate with each other. Therefore, we have infinitely many classes, namely {0}, {1}, {2}, . . . , and each one of them is transient.

7. The desired probability is
p11 p11 + p11 p12 + p12 p22 + p12 p21 + p21 p11 + p21 p12 + p22 p21 + p22 p22
= (0.20)2 + (0.20)(0.30) + (0.30)(0.15) + (0.30)(0.32)
+ (0.32)(0.20) + (0.32)(0.30) + (0.15)(0.32) + (0.15)2 = 0.4715. 8. The following is an example of such a transition probability matrix:
⎛
00
⎜1 0
⎜
⎜0 0
⎜
⎜0 1/2
P =⎜
⎜0 0
⎜
⎜0 0
⎜
⎝0 0
00 1
0
0
0
0
0
0
0 00
0
00
0
10
0
0 1/2 0
0 1/3 2/3
00
0
00
0
00
1 0
0
0
0
0
1
0
0 ⎞
0
0⎟
⎟
0⎟
⎟
0⎟
⎟.
0⎟
⎟
0⎟
⎟
1⎠
0 9. For n ≥ 1, let
Xn = 1 if the nth golfball produced is defective 0 if the nth golfball produced is good. Then {Xn : n = 1, 2, . . . } is a Markov chain with state space {0, 1} and transition probability
15/18 3/18
matrix 11/12 1/12 . Let π0 be the fraction of golfballs produced that are good, and π1 be
the fraction of the balls produced that are defective. Then, by Theorem 12.7, π0 and π1 satisfy
π0
π1 = 15/18 11/12
3/18 1/12 π0
,
π1 which gives us the following system of equations
⎧
⎨π0 = (15/18)π0 + (11/12)π1
⎩π = (3/18)π + (1/12)π .
1
0
1
By choosing any one of these equations along with the relation π0 + π1 = 1, we obtain a
system of two equations in two unknowns. Solving that system yields
π0 = 11
≈ 0.85
13 and π1 = 2
≈ 0.15.
13 Therefore, approximately 15% of the golfballs produced have no logos. 10. Let ⎧
⎪1
⎨
Xn = 2
⎪
⎩
3 if the nth ball is drawn by Carmela
if the nth ball is drawn by Daniela
if the nth ball is drawn by Lucrezia. Chapter 12 Review Problems 333 The process {Xn : n = 1, 2, . . . } is an irreducible, aperiodic, positive recurrent Markov chain
with transition probability matrix
⎛
⎞
7/31 11/31 13/31
P = ⎝7/31 11/31 13/31⎠ .
7/31 11/31 13/31
Let π1 , π2 , and π3 be the longrun proportion of balls drawn by Carmela, Daniela, and Lucrezia,
respectively. Intuitively, it should be clear that these quantities are 7/31, 11/31, and 13/31,
respectively. However, that can be seen also by solving the following matrix equation along
with π0 + π1 + π3 = 1.
⎛⎞ ⎛
⎞⎛ ⎞
π1
7/31 7/31 7/31
π1
⎝π2 ⎠ = ⎝11/31 11/31 11/31⎠ ⎝π2 ⎠ .
13/31 13/31 13/31
π3
π3 11. Let π1 and π2 be the longrun probabilities that Francesco devotes to playing golf and playing
tennis, respectively. Then, by Theorem 12.7, π1 and π2 are obtained from solving the system
of equations
π1
0.30 0.58
π1
=
0.70 0.42
π2
π2
along with π1 + π2 = 1. The matrix equation above gives the following system of equations:
π1 = 0.30π1 + 0.58π2
π2 = 0.70π1 + 0.42π2 .
By choosing any one of these equations along with the relation π1 + π2 = 1, we obtain
a system of two equations in two unknowns. Solving that system yields π1 = 0.453125
and π2 = 0.546875. Therefore, the longrun probability that, on a randomly selected day,
Francesco plays tennis is approximately 0.55. 12. Suppose that a train leaves the station at t = 0. Let X1 be the time until the ﬁrst passenger arrives at the station after t = 0. Let X2 be the additional time it will take until a train arrives
at the station, X3 be the time after that until a passenger arrives, and so on. Clearly, X1 ,
X2 , . . . are the times between consecutive change of states. By the memoryless property
of exponential random variables, {X1 , X2 , . . . } is a sequence of independent and identically
distributed exponential random variables with mean 1/λ. Hence, by Remark 7.2, N (t) : t ≥
0 is a Poisson process with rate λ. Therefore, N(t) is a Poisson random variable with
parameter λt . 13. Let X(t) be the number of components working at time t . Clearly, X (t) : t ≥ 0 is a
continuoustime Markov chain with state space {0, 1, 2}. Let π0 , π1 , and π2 be the longrun
proportion of time the process is in states 0, 1, and 2, respectively. The balance equations for
X (t) : t ≥ 0 are as follows: 334 Chapter 12 Stochastic Processes State Input rate to = Output rate from 0 λπ1 = µπ0 1 2λπ2 + µπ0 = µπ1 + λπ1 2 µπ1 = 2λπ2 µ
µ2
π0 and π2 = 2 π0 . Using π0 + π1 + π2 = 1 yields
λ
2λ
2λ2
π0 = 2
.
2λ + 2λµ + µ2 From these equations, we obtain π1 = Hence the desired probability is
1 − π0 = µ(2λ + µ)
.
+ 2λµ + µ2 2λ2 14. Suppose that every time an outoforder machine is repaired and is ready to operate a birth
occurs. Suppose that a death occurs every time that a machine breaks down. The fact that
X (t) : t ≥ 0 is a birth and death process should be clear. The birth and death rates are
⎧
⎪kλ
n = 0, 1, . . . , m + s − k
⎪
⎨
λn = (m + s − n)λ n = m + s − k + 1, m + s − k + 2, . . . , m + s
⎪
⎪
⎩
0
n ≥ m + s;
⎧
⎪nµ n = 0, 1, . . . , m
⎪
⎨
µn = mµ n = m + 1, m + 2, . . . , m + s
⎪
⎪
⎩
0
n > m + s. 15. Let X(t) be the number of machines operating at time t . For 0 ≤ i ≤ m, let πi be the longrun
proportion of time that there are exactly i machines operating. Suppose that a birth occurs
each time that an outoforder machine is repaired and begins to operate, and a death occurs
each time that a machine breaks down. Then X (t) : t ≥ 0 is a birth and death process with
state space {0, 1, . . . , m}, and birth and death rates, respectively, given by λi = (m − i)λ and
µi = iµ for i = 0, 1, . . . , m. To ﬁnd π0 , ﬁrst we will calculate the following sum:
m
i =1 λ0 λ1 · · · λi −1
=
µ1 µ2 · · · µi m
i =1
m =
i =1 (mλ) (m − 1)λ (m − 2)λ · · · (m − i + 1)λ
µ(2µ)(3µ) · · · (iµ)
λi
=
i ! µi m Pi m = −1 +
i =0 m
i m
i =1 m
i λ
µ i λ i m−i
λ
1
= −1 + 1 +
µ
µ m , Chapter 12 Review Problems 335 where m Pi is the number of i element permutations of a set containing m objects. Hence, by
(12.22),
m
λ −m
λ + µ −m
µ
π0 = 1 +
=
=
.
µ
µ
λ+µ
By (12.21),
πi = i
λ0 λ1 · · λi −1
m Pi λ
π0 =
π0
µ1 µ2 · · · µi
i ! µi µ
λ+µ = m
i λ
µ = m
i λ
λ+µ i i m 1− = m
i λ
λ+µ λ
µ
m−i , µ
λ+µ i µ
λ+µ i m−i 0 ≤ i ≤ m. Therefore, in steadystate, the number of machines that are operating is binomial with parameters m and λ/(λ + µ). 16. Let X(t) be the number of cars at the center, either being inspected or waiting to be inspected,
at time t . Clearly, X (t) : t ≥ 0 is a birth and death process with rates λn = λ/(n + 1),
n ≥ 0, and µn = µ, n ≥ 1. Since
∞
n=1 λ0 λ1 · · · λn−1
=
µ1 µ2 · · · µn ∞ λ· n=1 λλ
λ
· ···
∞
1λ
23
n
= −1 +
n
µ
n! µ
n=0 n = eλ/µ − 1. By (12.18), π0 = e−λ/µ . Hence, by (12.17), πn = λ· λ
λλ
· ···
23
n −λ/µ (λ/µ)n e−λ/µ
,
e
=
µn
n! n ≥ 0. Therefore, the longrun probability that there are n cars at the center for inspection is Poisson
with rate λ/µ. 17. Let X(t) be the population size at time t . Then X (t) : t ≥ 0 is a birth and death process with birth rates λn = nλ, n ≥ 1, and death rates µn = nµ, n ≥ 0. For i ≥ 0, let Hi be the time,
starting from i , until the population size reaches i + 1 for the ﬁrst time. We are interested in
4
i =1 E(Hi ). Note that, by Lemma 12.2,
E(Hi ) = 1
µi
+ E(Hi −1 ),
λi
λi Since E(H0 ) = 1/λ,
E(H1 ) = 1 µ1
1
µ
+ · = + 2,
λ
λλ
λλ i ≥ 1. 336 Chapter 12 Stochastic Processes E(H2 ) = 1
1
µ2
2µ
1
µ
µ
+
·
+2=
+ 2 + 3,
2λ
2λ
λλ
2λ λ
λ E(H3 ) = 1
1
µ2
µ2 µ3
3µ 1
µ
µ
+
+ 2+ 3 =
+ 2 + 3 + 4,
3λ
3λ 2λ λ
λ
3λ 2λ
λ
λ E(H4 ) = 1
4µ 1
µ
µ
µ2 µ3
1
µ2
µ3 µ4
+
+ 2+ 3+ 4 =
+ 2 + 3 + 4 + 5.
4λ
4λ 3λ 2λ
λ
λ
4λ 3λ
2λ
λ
λ Therefore, the answer is
4 E(Hi ) =
i =1 25λ4 + 34λ3 µ + 30λ2 µ2 + 24λµ3 + 12µ4
.
12λ5 18. Let X(t) be the population size at time t . Then X (t) : t ≥ 0 is a birth and death process with rates λn = γ , n ≥ 0, and µn = nµ, n ≥ 1. To ﬁnd πi ’s, we will ﬁrst calculate the sum
in the relation (12.18):
∞
n=1 λ0 λ1 · · · λn−1
=
µ1 µ2 · · · µn ∞
n=1 ∞ γn
1γ
= −1 +
n
n! µ
n! µ
n=0 n = −1 + eγ /µ . Thus, by (12.18), π0 = e−γ /µ and, by (12.17), for i ≥ 1,
πi = γ n −γ /µ (γ /µ)n e−γ /µ
e
=
.
n! µn
n! Hence the steadystate probability mass function of the population size is Poisson with parameter γ /µ. 19. By applying Theorem 12.9 to Y (t) : t ≥ 0 with t1 = 0, t2 = t , y1 = 0, y2 = y , and t = s ,
we have
E Y (s)  Y (t) = y = 0 +
and
Var Y (s)  Y (t) = y = σ 2 · s
y−0
(s − 0) = y,
t −0
t (t − s)(s − 0)
s
= σ 2 (t − s) .
t −0
t 20. First, suppose that s < t . By Example 10.23,
E X (s)X(t)  X(s) = X(s)E X (t)  X(s) .
Now, by Exercise 7, Section 12.5,
E X (t)  X(s) = X(s). Chapter 12 Review Problems 337 Hence
E X (s)X(t) = E E X (s)X(t)  X(s)
= E X (s)E X (t)  X(s)
= E X (s)X(s) = E X (s)2
= Var X (s) + E X (s) 2 = σ 2 s + 0 = σ 2 s.
For t < s, by symmetry,
E X (s)X(t) = σ 2 t.
Therefore,
  E[X(s)X(t)] = σ² min(s, t).

21. By Theorem 12.10,

  P(U < x and T > y) = P(no zeros in (x, y)) = 1 − (2/π) arccos √(x/y).

22. Let the current price of the stock, per share, be v0. Noting that √27.04 = 5.2, we have

  V(t) = v0 e^(3t + 5.2W(t)),

where {W(t) : t ≥ 0} is a standard Brownian motion. Hence W(t) ∼ N(0, t). The desired probability is calculated as follows:

  P(V(2) ≥ 2v0) = P( v0 e^(6 + 5.2W(2)) ≥ 2v0 )
  = P( 6 + 5.2W(2) ≥ ln 2 ) = P( W(2) ≥ −1.02 )
  = P( (W(2) − 0)/√2 ≥ −0.72 ) = P(Z ≥ −0.72) = 1 − P(Z < −0.72)
  = 1 − Φ(−0.72) = 0.7642.
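The closed-form value in problem 22 can be cross-checked by computing Φ from the error function (a standard identity; `normal_cdf` is a helper defined here, not the book's notation). Any small discrepancy against 0.7642 comes only from the text's rounding of the z-value to −0.72:

```python
import math

def normal_cdf(x):
    """Phi(x) for a standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Review problem 22: V(t) = v0*exp(3t + 5.2*W(t)) with W(2) ~ N(0, 2), so
# P(V(2) >= 2*v0) = P(W(2) >= (ln 2 - 6)/5.2) = 1 - Phi(((ln 2 - 6)/5.2)/sqrt(2)).
z = ((math.log(2.0) - 6.0) / 5.2) / math.sqrt(2.0)
prob = 1.0 - normal_cdf(z)
print(z, prob)  # z is about -0.72; prob agrees with 0.7642 to table precision
```

The same two-line pattern (standardize, then evaluate Φ) reproduces every normal-table lookup in this chapter.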