Theorem 4.13). Let w′ and g′ be the relative gain and gain per stage for k′ and let u be an arbitrary final reward vector.

a) Let k′ = (k′1, k′2, . . . , k′M). Show that there is some α > 0 such that for each i and each k ≠ k′i,

$$r_i^{(k'_i)} + \sum_j P_{ij}^{(k'_i)} w'_j \;\geq\; r_i^{(k)} + \sum_j P_{ij}^{(k)} w'_j + \alpha.$$

Hint: Look at the proof of Lemma 4.5.
b) Show that there is some n0 such that for all n ≥ n0,

$$\Bigl| \, v_j^*(n-1) - (n-1)g' - w'_j - \beta(u) \, \Bigr| < \alpha/2,$$

where β(u) is given in Theorem 4.13.

c) Use part b) to show that for all i and all n ≥ n0,

$$r_i^{(k'_i)} + \sum_j P_{ij}^{(k'_i)} v_j^*(n-1) \;>\; r_i^{(k'_i)} + \sum_j P_{ij}^{(k'_i)} w'_j + (n-1)g' + \beta(u) - \alpha/2.$$
d) Use parts a) and b) to show that for all i, all n ≥ n0, and all k ≠ k′i,

$$r_i^{(k)} + \sum_j P_{ij}^{(k)} v_j^*(n-1) \;<\; r_i^{(k'_i)} + \sum_j P_{ij}^{(k'_i)} w'_j + (n-1)g' + \beta(u) - \alpha/2.$$

e) Combine parts c) and d) to conclude that the optimal dynamic policy uses policy k′ for all n ≥ n0.
Exercise 4.36. Consider an integer time queueing system with a ﬁnite buﬀer of size 2. At
the beginning of the nth time interval, the queue contains at most two customers. There
is a cost of one unit for each customer in queue (i.e., the cost of delaying that customer).
If there is one customer in queue, that customer is served. If there are two customers, an
extra server is hired at a cost of 3 units and both customers are served. Thus the total
immediate cost for two customers in queue is 5, the cost for one customer is 1, and the cost
for 0 customers is 0. At the end of the nth time interval, either 0, 1, or 2 new customers
arrive (each with probability 1/3).
a) Assume that the system starts with 0 ≤ i ≤ 2 customers in queue at time −1 (i.e., in stage 1) and terminates at time 0 (stage 0) with a final cost u of 5 units for each customer in queue (at the beginning of interval 0). Find the expected aggregate cost v_i(1, u) for 0 ≤ i ≤ 2.

b) Assume now that the system starts with i customers in queue at time −2 with the same final cost at time 0. Find the expected aggregate cost v_i(2, u) for 0 ≤ i ≤ 2.

c) For an arbitrary starting time −n, find the expected aggregate cost v_i(n, u) for 0 ≤ i ≤ 2.

d) Find the cost per stage and find the relative cost (gain) vector.
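Parts a)–d) can be checked numerically: since every queued customer is served within its interval, the next state is just the number of new arrivals. A short backward-recursion sketch in Python (exact arithmetic via `fractions`; the function name is mine):

```python
from fractions import Fraction

c = [Fraction(0), Fraction(1), Fraction(5)]   # immediate cost in states 0, 1, 2
u = [Fraction(0), Fraction(5), Fraction(10)]  # final cost: 5 units per waiting customer

def aggregate_cost(n, u):
    """v_i(n, u): expected aggregate cost over n stages.

    All queued customers are served each interval, so the next state is
    the number of new arrivals: 0, 1, or 2, each with probability 1/3.
    """
    v = list(u)
    for _ in range(n):
        avg = sum(v) / 3          # E[cost-to-go] is the same from every state
        v = [c_i + avg for c_i in c]
    return v

print(aggregate_cost(1, u))   # [5, 6, 10]
print(aggregate_cost(2, u))   # [7, 8, 12]
```

The recursion also exhibits part d): for n ≥ 2, v_i(n) − v_i(n−1) = 2 for every i, so the cost per stage is g = 2, and the relative costs differ across states by c_i − c_j.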
e) Now assume that there is a decision maker who can choose whether or not to hire the
extra server when there are two customers in queue. If the extra server is not hired, the
3 unit fee is saved, but only one of the customers is served. If there are two arrivals in
this case, assume that one is turned away at a cost of 5 units. Find the minimum dynamic aggregate expected cost v*_i(1), 0 ≤ i ≤ 2, for stage 1 with the same final cost as before.

f) Find the minimum dynamic aggregate expected cost v*_i(n, u) for stage n, 0 ≤ i ≤ 2.

g) Now assume a final cost u of one unit per customer rather than 5, and find the new minimum dynamic aggregate expected cost v*_i(n, u), 0 ≤ i ≤ 2.
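For parts e)–g), the same recursion applies with a minimization in state 2. A sketch under my reading of the costs (the 2-unit delay cost is paid in state 2 whether or not the extra server is hired; the helper name is mine):

```python
from fractions import Fraction

def min_aggregate_cost(n, u):
    """v*_i(n, u): minimum expected aggregate cost with the hire/no-hire choice.

    States 0 and 1 involve no decision.  In state 2:
      hire:    immediate cost 2 + 3, next state = number of arrivals a;
      no hire: immediate cost 2, one customer served, next = min(1 + a, 2),
               plus a 5-unit turn-away cost when a = 2.
    Arrivals a = 0, 1, 2 each have probability 1/3.
    """
    third = Fraction(1, 3)
    v = [Fraction(x) for x in u]
    for _ in range(n):
        avg = sum(v) * third
        hire = 5 + avg
        no_hire = 2 + third * (v[1] + v[2] + (v[2] + 5))
        v = [avg, 1 + avg, min(hire, no_hire)]
    return v

print(min_aggregate_cost(1, [0, 5, 10]))  # final cost 5 per customer: hiring wins
print(min_aggregate_cost(1, [0, 1, 2]))   # final cost 1 per customer: not hiring wins
```

With the 5-unit final cost, hiring the extra server is optimal at stage 1; with the 1-unit final cost of part g), the comparison flips.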
Exercise 4.37. Consider a finite-state ergodic Markov chain {Xn; n ≥ 0} with an integer-valued set of states {−K, −K+1, . . . , −1, 0, 1, . . . , +K}, a set of transition probabilities {Pij; −K ≤ i, j ≤ K}, and initial state X0 = 0. One example of such a chain is given by:
[Figure: a three-state chain on {−1, 0, 1}; each state has self-transition probability 0.9, and P−1,0 = P0,1 = P1,−1 = 0.1.]
Let {Sn; n ≥ 0} be a stochastic process with $S_n = \sum_{i=0}^{n} X_i$. Parts (a), (b), and (c) are independent of parts (d) and (e). Parts (a), (b), and (c) should be solved both for the special case in the above graph and for the general case.
a) Find lim_{n→∞} E[Xn] for the example and express lim_{n→∞} E[Xn] in terms of the steady-state probabilities of {Xn; n ≥ 0} for the general case.

b) Show that lim_{n→∞} Sn/n exists with probability one and find the value of the limit. Hint: apply renewal-reward theory to {Xn; n ≥ 0}.

c) Assume that lim_{n→∞} E[Xn] = 0. Find lim_{n→∞} E[Sn].
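For the example chain, the limits in parts a) and b) follow from the stationary distribution. A numerical sketch (the transition matrix below is as read off the figure):

```python
import numpy as np

# Example chain: states -1, 0, 1; each state holds with probability 0.9
# and steps -1 -> 0 -> 1 -> -1 with probability 0.1.
states = np.array([-1, 0, 1])
P = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.9, 0.1],
              [0.1, 0.0, 0.9]])

# Stationary distribution: left eigenvector of P for eigenvalue 1
# (computed as a right eigenvector of P transposed).
w, vl = np.linalg.eig(P.T)
pi = np.real(vl[:, np.argmax(np.real(w))])
pi = pi / pi.sum()

# lim E[X_n] = sum_i pi_i * i, which by renewal-reward is also
# lim S_n / n with probability 1.
print(pi, states @ pi)
```

By the cyclic symmetry of the example, pi is uniform on the three states and the limit is 0.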
d) Show that

$$\Pr\{S_n = s_n \mid S_{n-1} = s_{n-1},\, S_{n-2} = s_{n-2},\, S_{n-3} = s_{n-3},\, \ldots,\, S_0 = 0\} = \Pr\{S_n = s_n \mid S_{n-1} = s_{n-1},\, S_{n-2} = s_{n-2}\}.$$
e) Let Yn = (Sn, Sn−1) (i.e., Yn is a two-dimensional integer-valued random vector). Show that {Yn; n ≥ 0} (where Y0 = (0, 0)) is a Markov chain. Describe the transition probabilities of {Yn; n ≥ 0} in terms of {Pij}.

Exercise 4.38. Consider a Markov decision problem with M states in which some state,
say state 1, is inherently reachable from each other state.
a) Show that there must be some other state, say state 2, and some decision, k2, such that $P_{21}^{(k_2)} > 0$.

b) Show that there must be some other state, say state 3, and some decision, k3, such that either $P_{31}^{(k_3)} > 0$ or $P_{32}^{(k_3)} > 0$.

c) Assume, for some i, and some set of decisions k2, . . . , ki, that, for each j, 2 ≤ j ≤ i, $P_{jl}^{(k_j)} > 0$ for some l < j (i.e., that each state from 2 to j...
This note was uploaded on 09/27/2010 for the course EE 229 taught by Professor R. Srikant during the Spring '09 term at the University of Illinois, Urbana-Champaign.