TV1982(2) - VolumeXXVII T H E O R Y OF PROBABILITY AND ITS...

THEORY OF PROBABILITY AND ITS APPLICATIONS
Volume XXVII, Number 3, 1982

CONTROLLED MARKOV PROCESSES WITH ARBITRARY NUMERICAL CRITERIA

E. A. FAINBERG

(Translated by W. U. Sirk)

1. Introduction

In the theory of controlled Markov processes with discrete time we study, as a rule, controlled processes either with the total reward criterion or with criteria for mean reward per unit time. The theory of controlled Markov processes with Borel state and control spaces in the case of the total reward criterion was developed by D. Blackwell [1], [2] and R. Strauch [3]. In [4]-[6] the basic results of the investigations [1]-[3] were extended to nonhomogeneous models, the foundations of the theory of which were laid in [7]-[9]. The study of controlled processes with mean reward criteria was started at the end of the 1950s. The first fundamental results were obtained in the publications by R. Howard [10], C. Derman [11], [12], O. V. Viskov and A. N. Shiryaev [13].

In addition, there exist publications (for example, [14]-[16]) in which controlled processes with other numerical criteria are investigated. There are also a number of works in which the value of the criterion constitutes a finite-dimensional or infinite-dimensional vector (for example, [17]-[19]), or the value of the criterion is not computed, but a rule is given according to which one strategy is preferable to others (for example, [19]-[23]).

In connection with the existing variety of criteria and methods of their investigation, the problem arises of developing general methods for the investigation of all criteria or of individual groups of criteria. One such group of criteria, the so-called expected utility criteria, was studied in [27], [34]-[38]. In this case the criterion is the expectation of a functional specified on the trajectory space of the process. The total reward criterion is a particular case of the expected utility criterion.
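
The relation between the two criteria just described can be written out as a short sketch. The notation here is an assumption made for illustration and is not taken from the paper: \omega denotes a trajectory of states x_t and controls a_t, \pi a strategy, \mu an initial measure, F a functional on the trajectory space, and r_t the one-step reward at time t. The sketch also assumes that the sum and the expectation are well defined.

```latex
% Expected utility criterion: the expectation of a functional F
% on the trajectory space (notation assumed for illustration).
\[
  w(\pi) = \mathbf{E}^{\pi}_{\mu} F(\omega),
  \qquad \omega = (x_0, a_0, x_1, a_1, \dots).
\]
% Total reward criterion: the particular case in which F sums
% the one-step rewards along the trajectory.
\[
  F(\omega) = \sum_{t=0}^{\infty} r_t(x_t, a_t).
\]
```
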
When expected utility criteria are investigated, conditions additional to those of [1]-[3], guaranteeing existence of optimal strategies, are imposed as a rule on the model. Regrettably, criteria of mean reward per single step are not expected utility criteria.

In the present paper we consider nonhomogeneous Borel models with discrete time and unbounded horizon. We investigate arbitrary numerical criteria, i.e., criteria the values of which are given by numerical functions defined on the space of strategic measures. We introduce three properties of a criterion: measurability, convexity and decomposability (Definitions 2.1-2.3). We establish that from these properties of a criterion follows the existence of nonrandomized strategies and nonrandomized Markov strategies, while in the case of a specified initial measure there follows the existence of nonrandomized Markov strategies that are close to optimal strategies. Thus, in the case of a particular criterion, for the proof of existence of the strategies mentioned above, it is sufficient to verify that the criterion possesses certain properties....