MMOR manuscript No.
(will be inserted by the editor)

ON ESSENTIAL INFORMATION IN SEQUENTIAL DECISION PROCESSES

Eugene A. Feinberg

Department of Applied Mathematics and Statistics; State University of New York; Stony Brook, NY 11794-3600; USA; [email protected]

The date of receipt and acceptance will be inserted by the editor

Supported in part by grant DMI-0300121 from the National Science Foundation.

Abstract This paper provides sufficient conditions under which certain information about the past of a stochastic decision process can be ignored by a controller. We illustrate the results with particular applications to queueing control, control of semi-Markov decision processes with iid sojourn times, and uniformization of continuous-time Markov decision processes.

1 Introduction

The results of this paper are based on the following simple observation. If each state of a controlled stochastic system consists of two coordinates, and neither the transition mechanism for the first coordinate nor the costs depend on the second coordinate, then the controller can ignore the values of the second coordinate. Theorem 2 and Corollary 4 present the appropriate formulations for discrete-time and for continuous-time jump problems, respectively, and for various performance criteria, including average costs per unit time and total discounted costs. These statements indicate that the additional information carried by the second coordinate cannot be used to improve the system performance.

Though these facts are simple and general, they are useful for the analysis of various particular problems. This paper is motivated by two groups of applications: (i) controlled queues and (ii) uniformization of Continuous-Time Markov Decision Processes (CTMDP). We illustrate our motivation in the introduction with one such application. Additional examples are presented in Section 4.

Consider the problem of routing to parallel queues with known workload; see the last example, “Policies based on queue length and workload”, in Koole. Customers arrive according to a Poisson process into a system consisting of $m$ homogeneous restless queues. The service times of arriving customers are not known at the arrival epochs, but the state of the system that an arrival sees is known. This state includes the workloads in all $m$ queues. The costs depend only on the workload vector. In particular, the $m$-dimensional vectors of workloads and of numbers of customers in the queues are known. We denote them by $w$ and $\ell$, respectively. Since nothing depends on $\ell$, this coordinate was dropped in Koole's example and the state of the system was reduced to the workload vector $w$.

If the service discipline is FIFO or all $m$ queues use deterministic service disciplines, then the sufficiency of using $w$ as the state follows from the following simple argument. Let $W_{n-1}$ be the workload vector that the $n$th arrival sees, $n = 0, 1, \ldots$, and $W$ is the null vector. Then the sequence $W_1, \ldots, W_n, \ldots$ ...
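The routing example can be illustrated with a minimal simulation sketch in which the workload vector seen by the next arrival is computed from the current workload vector, the chosen queue, the service time, and the interarrival time alone, so the queue-length vector $\ell$ never enters. Everything named here is an assumption made for illustration and is not taken from the paper: the function names `next_workload`, `simulate`, and `shortest_workload`, the exponential service times, the unit-rate work-conserving servers, the rate values, and the routing rule itself.

```python
import random


def next_workload(w, queue, service_time, interarrival_time):
    """Return the workload vector seen by the next arrival.

    Assumption: unit-rate, work-conserving servers. The routed customer's
    service time is added to the chosen queue, then every queue drains for
    `interarrival_time` time units and never drops below zero.
    """
    after_routing = list(w)              # copy, so the input is not mutated
    after_routing[queue] += service_time
    return [max(x - interarrival_time, 0.0) for x in after_routing]


def simulate(policy, m, n_arrivals, arrival_rate=1.0, service_rate=1.5, seed=0):
    """Simulate the workload vectors seen by successive arrivals.

    `policy` maps the current workload vector to a queue index. It is never
    given queue lengths, and it chooses before the service time is drawn.
    Exponential service times are an illustrative assumption.
    """
    rng = random.Random(seed)
    w = [0.0] * m                        # the system starts empty
    seen = [w]
    for _ in range(n_arrivals - 1):
        queue = policy(w)
        service_time = rng.expovariate(service_rate)
        interarrival_time = rng.expovariate(arrival_rate)  # Poisson arrivals
        w = next_workload(w, queue, service_time, interarrival_time)
        seen.append(w)
    return seen


def shortest_workload(w):
    """Hypothetical routing rule: send the arrival to the least-loaded queue."""
    return min(range(len(w)), key=w.__getitem__)
```

Calling `simulate(shortest_workload, m=3, n_arrivals=5)` returns five workload vectors, the first of which is the null vector. The update in `next_workload` has no argument for queue lengths, which is the sense in which the second coordinate carries no information the controller can use in this sketch.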