# TV1986 - T H E O R Y PROBAB APPL Vol 31 No Translated from...


THEORY PROBAB. APPL. Vol. 31, No. Translated from Russian Journal

SUFFICIENT CLASSES OF STRATEGIES IN DISCRETE DYNAMIC PROGRAMMING I: DECOMPOSITION OF RANDOMIZED STRATEGIES AND EMBEDDED MODELS*

E. A. FAINBERG

(Translated by Merle Ellis)

1. Introduction. One of the major questions that occurs in investigating problems of dynamic programming on an infinite time interval is: in which natural classes of strategies do there exist strategies that produce a pay-off uniformly close to the pure value? It is known that in the case of finite state and control sets, optimal stationary strategies exist [1] (this also follows from [2]). However, if the set of states or controls is infinite, then optimal (and even $\varepsilon$-optimal) stationary strategies need not exist [3], [4, Chap. 6, §6, Example 2].

In the case of a countable state space $X$, a very general result was announced in [5, Theorem 2.1] that gives a description of such classes. Let $\mathcal{F}$ denote the set of all mappings $F: X \to 2^X$ such that $x \in F(x)$ for all $x \in X$. A nonrandomized strategy is called an $F$-strategy if in any current state $x$ the control is chosen depending only on this state and the total time passed in $F(x)$ prior to the current instant of time. Markovian and tracking [6] strategies are special cases of $F$-strategies. Let $X^*$ be a state set where the pure value is different from 0, or the pure value is equal to zero but controls exist on which the value of the optimality operator applied to the pure value is attained. According to Theorem 2.1 formulated in [5], for any $F \in \mathcal{F}$ there exists an $F$-strategy that for all initial prehistories produces a pay-off uniformly close to the pure value (a persistently $\varepsilon l$-optimal strategy; a precise definition of persistently $\varepsilon l$-optimal strategies is given in §6) and is stationary on $X^*$.
This theorem, on the one hand, elaborates the results of Everett [7] and Chitashvili [8], [9] on the sufficiency of strategies that are stationary on subsets in the case where $X$ is finite; on the other hand, it extends in various ways results on the sufficiency of stationary strategies [3], [10]-[12] and the sufficiency of Markov strategies [13]-[16], and gives a positive answer to a problem raised in [17] regarding the sufficiency of tracking strategies. This article uses one of the two schemes suggested in [5] to prove Theorem 2.1 of [5]. We prove a stronger result (Theorem 6.2) that generalizes Theorem 2.1 in [5] in the following three directions: (1) a broader state set than $X^*$ is given where one can confine oneself to stationary controls; (2) wherever one cannot confine oneself to stationary controls, one can choose nonstationary strategies not only among the $F$-strategies but also from other natural classes of strategies; (3) the existence of persistently $\varepsilon l$-optimal strategies is proved for functions in a broader class than in Theorem 2.1 of [5]....
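
The definition of an $F$-strategy in the introduction can be made concrete with a small sketch. The code below is an illustration only, not the paper's construction. It assumes a finite list of visited states stands in for the prehistory (past controls are ignored), and the names `time_in_F`, `make_f_strategy`, and `rule` are hypothetical. It also assumes the usual readings of the two special cases: $F(x) = X$ for Markov strategies (the statistic is the elapsed time) and $F(x) = \{x\}$ for tracking strategies (the statistic is the number of earlier visits to the current state).

```python
from typing import Callable, Hashable, Sequence, Set, Tuple

State = Hashable


def time_in_F(history: Sequence[State], F: Callable[[State], Set[State]]) -> int:
    """Count the epochs before the current one at which the state lay in F(x),
    where x is the current (last) state of the history."""
    current = history[-1]
    region = F(current)
    # The mappings considered in the text satisfy x in F(x) for every x.
    assert current in region, "F must satisfy x in F(x)"
    return sum(1 for s in history[:-1] if s in region)


def make_f_strategy(F: Callable[[State], Set[State]],
                    rule: Callable[[State, int], object]):
    """Build a nonrandomized strategy whose control depends only on the
    current state and the time previously spent in F(current state).
    `rule` is a hypothetical decision rule supplied by the caller."""
    def strategy(history: Sequence[State]):
        x = history[-1]
        return rule(x, time_in_F(history, F))
    return strategy


# Illustrative state space and state history (assumed, not from the paper).
X = {0, 1, 2}
history = [0, 1, 0, 2, 0]

markov_F = lambda x: X        # assumed reading: F(x) = X gives the elapsed time
tracking_F = lambda x: {x}    # assumed reading: F(x) = {x} gives prior visits to x

# A placeholder rule that simply reports what the strategy is allowed to see.
observe: Callable[[State, int], Tuple[State, int]] = lambda x, t: (x, t)

print(make_f_strategy(markov_F, observe)(history))    # (0, 4)
print(make_f_strategy(tracking_F, observe)(history))  # (0, 2)
```

With the same history, the Markov-type mapping sees the current state and the four elapsed epochs, while the tracking-type mapping sees the current state and its two earlier visits; any mapping $F$ between these extremes yields an intermediate statistic.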

This note was uploaded on 12/06/2011 for the course MATH 101 taught by Professor Eugene A. Feinberg during the Fall '11 term at State University of New York.
