TV1982(1) - THEORY OF PROBABILITY AND ITS... (Volume XXVII)


THEORY OF PROBABILITY AND ITS APPLICATIONS
Volume XXVII, Number 1, 1982

NON-RANDOMIZED MARKOV AND SEMI-MARKOV STRATEGIES IN DYNAMIC PROGRAMMING

E. A. FAINBERG

(Translated by W. U. Sirk)

1. Introduction

In a non-homogeneous controllable Markov model with a total reward criterion, discrete time, infinite horizon and Borel spaces of states and controls, let a certain strategy $\pi$ and an initial measure $\mu$ be given. In the paper the following two statements are proved:

(a) (Theorem 3) for any $K < +\infty$, there exists a non-randomized Markov strategy $\varphi$ such that

$$
w(\mu, \varphi) \ge
\begin{cases}
w(\mu, \pi) & \text{if } w(\mu, \pi) < +\infty, \\
K & \text{if } w(\mu, \pi) = +\infty;
\end{cases}
\tag{1}
$$

(b) (Theorem 4) for any measurable function $K(x) < +\infty$ given on a set of initial states $X_0$, there exists a non-randomized semi-Markov strategy $\varphi'$ such that, for any $x \in X_0$,

$$
w(x, \varphi') \ge
\begin{cases}
w(x, \pi) & \text{if } w(x, \pi) < +\infty, \\
K(x) & \text{if } w(x, \pi) = +\infty.
\end{cases}
\tag{2}
$$

The quantities $w(\mu, \pi)$ and $w(x, \pi)$ are the expectations of total reward in the case of the strategy $\pi$ and initial measure $\mu$, and initial state $x$, respectively.

Controllable Markov models with Borel state spaces, as well as problems of existence of Markov and semi-Markov strategies in such models which majorize arbitrary strategies, were studied for the first time by Blackwell [1], [2]. These investigations were continued by Strauch [3], where three cases were considered: positive (P) and negative (N) dynamic programming, as well as dynamic programming with discounting (D). For the cases D and N it was proved, as one of the fundamental results of the investigation ([3], Theorem 4.3), that there exist non-randomized Markov strategies $\varphi$ and semi-Markov strategies $\varphi'$ such that $w(\mu, \varphi) \ge w(\mu, \pi)$ and $w(x, \varphi') \ge w(x, \pi)$ for all initial states $x$. In all three cases, D, N and P, it was assumed in [3] that $w(\mu, \pi) < +\infty$ for all $\mu$ and $\pi$, and in view of this the constant $K$ and the function $K(x)$ were not considered. For the case P (cf. [3], Theorem 4.4), existence of non-randomized Markov strategies $\varphi$ and semi-Markov strategies $\varphi'$, such that $w(\mu, \varphi) \ge w(\mu, \pi) - \varepsilon$ and $w(x, \varphi') \ge w(x, \pi) - \varepsilon$ for all initial states $x$, was proved for any $\varepsilon > 0$. In [3] it was pointed out that it is not known whether the last result is true for $\varepsilon = 0$. (We note that in the formulation of the problem it was assumed in [3] that the initial measure is concentrated at a single point. The case of an arbitrary initial measure $\mu$, considered for the first time by Hinderer [4], does not introduce additional difficulties.)

Homogeneous models were considered in [1]-[3]. The concept of non-homogeneous controllable models arose as a result of the investigations [5]-[7]. In [4], [8] and [9] a considerable part of the investigations [1]-[3] was extended to the case of non-homogeneous models, with a broader class of income functions being investigated in [4] and [9] than in [1]-[3]. Under weak conditions, the result on existence of a non-randomized Markov strategy in the non-homogeneous case which majorizes an arbitrary strategy is presented in [9], Chapter 5, §1, Statement II. Also there, for the case $w(\mu, \pi) < +\infty$, the question...

This note was uploaded on 12/06/2011 for the course MATH 101 taught by Professor Eugene A. Feinberg during the Fall '11 term at State University of New York.
