THEORY PROBAB. APPL. Vol. 31, No.

Translated from Russian Journal

SUFFICIENT CLASSES OF STRATEGIES IN DISCRETE DYNAMIC PROGRAMMING I: DECOMPOSITION OF RANDOMIZED STRATEGIES AND EMBEDDED MODELS*

E. A. FAINBERG

(Translated by Merle Ellis)

1. Introduction. One of the major questions that occurs in investigating problems of dynamic programming on an infinite time interval is: in which natural classes of strategies do there exist strategies that produce a payoff uniformly close to the pure value? It is known that in the case of finite state and control sets, optimal stationary strategies exist [1] (this also follows from [2]). However, if the set of states or controls is infinite, then optimal (and even ε-optimal) stationary strategies need not exist [3], [4, Chap. 6, §6, Example 2].

In the case of a countable state space X, a very general result was announced in [5, Theorem 2.1] that gives a description of such classes. Let F denote the set of all mappings F: X → 2^X such that x ∈ F(x) for all x ∈ X. A nonrandomized strategy is called an F-strategy if in any current state x the control is chosen depending only on this state and the total time spent in F(x) prior to the current instant of time. Markovian and tracking [6] strategies are special cases of F-strategies. Let X* be the state set where either the pure value is different from 0, or the pure value is equal to zero but controls exist on which the value of the optimality operator applied to the pure value is attained. According to Theorem 2.1 formulated in [5], for any F ∈ F there exists an F-strategy that for all initial prehistories produces a payoff uniformly close to the pure value (a persistently ε-optimal strategy; a precise definition of persistently ε-optimal strategies is given in §6) and is stationary on X*.
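As a rough illustration only (not taken from the paper), the defining property of an F-strategy can be sketched in code: the chosen control is a function solely of the current state x and the total time the process has previously spent inside F(x). The model, the map F, and the control rule below are invented for this sketch.

```python
# Toy sketch of an F-strategy (hypothetical model, not from the paper).
# F maps each state x to a subset F(x) of states with x in F(x); an
# F-strategy chooses the control from the current state x and the total
# time previously spent in F(x).

def make_f_strategy(F, choose):
    """Return a policy pi(history, x) that depends only on x and on the
    time spent in F(x) along the history (a list of past states)."""
    def pi(history, x):
        time_in_Fx = sum(1 for s in history if s in F[x])
        return choose(x, time_in_Fx)
    return pi

# Hypothetical two-state model; Markov strategies are the special case
# where `choose` ignores its second argument.
F = {0: {0}, 1: {0, 1}}
choose = lambda x, t: "a" if t % 2 == 0 else "b"  # invented control rule
pi = make_f_strategy(F, choose)

print(pi([0, 1, 0], 0))  # 2 past states lie in F(0) = {0}, t even -> "a"
print(pi([0, 1, 0], 1))  # all 3 past states lie in F(1), t odd -> "b"
```

Two histories that agree on the current state and on the time spent in F(x) yield the same control, which is exactly the restriction the definition imposes.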
This theorem, on the one hand, elaborates the results of Everett [7] and Chitashvili [8], [9] on the sufficiency of strategies that are stationary on subsets in the case where X is finite; on the other hand, it extends in various ways results on the sufficiency of stationary strategies [3], [10]-[12] and the sufficiency of Markov strategies [13]-[16], and gives a positive answer to a problem raised in [17] regarding the sufficiency of tracking strategies. This article uses one of the two schemes suggested in [5] to prove Theorem 2.1 of [5]. We prove a stronger result (Theorem 6.2) that generalizes Theorem 2.1 of [5] in the following three directions: (1) a broader state set than X* is given on which one can confine oneself to stationary controls; (2) wherever one cannot confine oneself to stationary controls, one can choose nonstationary strategies not only among the F-strategies but also from other natural classes of strategies; (3) the existence of persistently ε-optimal strategies is proved for functions in a broader class than in Theorem 2.1 of [5]. ...
This note was uploaded on 12/06/2011 for the course MATH 101 taught by Professor Eugene A. Feinberg during the Fall '11 term at State University of New York.