TV1978 - Volume XXIII THEORY OF PROBABILITY AND ITS...

Volume XXIII   THEORY OF PROBABILITY AND ITS APPLICATIONS   1978   Number 2

THE EXISTENCE OF A STATIONARY ε-OPTIMAL POLICY FOR A FINITE MARKOV CHAIN

E. A. FAINBERG

(Translated by K. Durr)

In this paper we investigate the problem of optimal control of a Markov chain with a finite number of states when the control sets are compact in the metric space. The goal of the control is to maximize the average reward per unit step.

For the case of finite control and state sets the existence of a stationary optimal policy was proved in [1] and [2]. In [3]-[5] it was proved that for a controlled Markov process with finite state space, compact control sets and continuous reward and transition functions there may not exist an optimal policy.

In this paper it is proved that if the state space is finite, the control sets are compact, the transition functions are continuous and the reward functions are upper semicontinuous, then for any positive ε there exists a stationary ε-optimal policy. Here the average reward can be understood as either the lower or the upper limit of the average reward per unit step. For the case of the lower limit the existence of a stationary ε-optimal policy was proved in [4].

Examples in [3] and [4] show that if the above restrictions on the control sets, the transition functions and the reward functions are not satisfied, there may not exist a stationary ε-optimal policy for some positive ε. If the state space is not finite then, as the example in [6] shows, there may not be a stationary ε-optimal policy even in the case of finite control sets. Observe that if the number of states is two, then, according to [7], under the assumptions made in this paper there exists a stationary optimal policy.

Sufficient conditions for the existence of stationary optimal policies were studied in [7]-[9]; they impose certain additional restrictions on the control sets (in relation to the requirements of the present paper). In [8] it was proved that for compact convex control sets coinciding with the sets of transition probabilities and concave continuous reward functions there exists a stationary optimal policy if every stationary policy defines an ergodic Markov chain without transient states. In [9] it was shown that under the condition derived in [8] it is sufficient to require that at least one stationary policy, rather than every stationary policy, define an ergodic Markov chain without transient states.

In [7] two sufficient conditions were given. Each consists of adding one of the following restrictions to the assumptions of the present paper: (i) any stationary policy defines an ergodic Markov chain with one ergodic class and possibly with transient states; (ii) for each state the set of transition probabilities contains a finite number of extreme points.

1. Basic Definitions

Let $X = \{1, 2, \ldots, s\}$ be the state space. For each state $x \in X$ let there be given a control set $A_x$....
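The criterion named in the introduction, the average reward per unit step under a stationary policy, can be illustrated with a small sketch. It is not taken from the paper: the function names `average_reward` and `is_eps_optimal`, the finite horizon, and the comparison against a finite list of candidate policies are all assumptions made for illustration. In the paper the criterion is a limit as the horizon grows, and ε-optimality is measured against all policies, not a finite list.

```python
def average_reward(P, r, horizon):
    """Expected average reward per step over a finite horizon, per start state.

    P: s x s row-stochastic matrix (list of lists) induced by a stationary policy.
    r: length-s list of one-step rewards under that policy.
    Returns the list of (1/horizon) * sum_{t < horizon} (P^t r)(x) over states x.
    The finite horizon is an illustrative stand-in for the limit in the paper.
    """
    s = len(r)
    v = list(r)            # v holds P^t r, starting with t = 0
    total = [0.0] * s
    for _ in range(horizon):
        total = [total[x] + v[x] for x in range(s)]
        v = [sum(P[x][y] * v[y] for y in range(s)) for x in range(s)]
    return [total[x] / horizon for x in range(s)]


def is_eps_optimal(value, candidate_values, eps):
    """True if `value` is within eps of the best candidate in every state.

    Assumption: the supremum over all policies is replaced by a maximum
    over a finite list of candidate value vectors.
    """
    s = len(value)
    best = [max(c[x] for c in candidate_values) for x in range(s)]
    return all(value[x] >= best[x] - eps for x in range(s))


if __name__ == "__main__":
    # Hypothetical two-state chain: reward 1 in state 0, reward 0 in state 1.
    P = [[0.5, 0.5], [0.5, 0.5]]
    r = [1.0, 0.0]
    print(average_reward(P, r, 1000))
```

For a compact control set one could, under the same assumptions, evaluate `average_reward` on a fine grid of controls and use `is_eps_optimal` to compare the resulting stationary policies; this only approximates the notion studied in the paper.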

This note was uploaded on 12/06/2011 for the course MATH 101 taught by Professor Eugene A. Feinberg during the Fall '11 term at State University of New York.
