MATHEMATICS OF OPERATIONS RESEARCH
Vol. 32, No. 4, November 2007, pp. 769–783
issn 0364-765X | eissn 1526-5471 | 07 | 3204 | 0769
doi 10.1287/moor.1070.0269
© 2007 INFORMS

Optimality Inequalities for Average Cost Markov Decision Processes and the Stochastic Cash Balance Problem

Eugene A. Feinberg
Department of Applied Mathematics & Statistics, State University of New York at Stony Brook, Stony Brook, New York 11794, efeinberg@notes.cc.sunysb.edu, http://www.ams.sunysb.edu/~feinberg/

Mark E. Lewis
School of Operations Research & Industrial Engineering, Cornell University, 226 Rhodes Hall, Ithaca, New York 14853, mark.lewis@cornell.edu, http://www.orie.cornell.edu/orie/people/faculty/profile.cfm?netid=mark.lewis

Abstract. For general state and action space Markov decision processes, we present sufficient conditions for the existence of solutions of the average cost optimality inequalities. These conditions also imply the convergence of both the optimal discounted cost value function and policies to the corresponding objects for the average costs per unit time case. Inventory models are natural applications of our results. We describe structural properties of average cost optimal policies for the cash balance problem, an inventory control problem where the demand may be negative and the decision-maker can produce or scrap inventory. We also show the convergence of optimal thresholds in the finite horizon case to those under the expected discounted cost criterion, and of those under the expected discounted cost criterion to those under the average costs per unit time criterion.
Key words: Markov decision process; average cost per unit time; optimality inequality; optimal policy; inventory control
MSC2000 subject classification: Primary: 90C40 (Markov and semi-Markov decision processes); secondary: 90B05 (inventory, storage, reservoirs)
OR/MS subject classification: Primary: dynamic programming/optimal control/Markov/infinite state; secondary: inventory/production/uncertainty/stochastic
History: Received April 27, 2006; revised September 7, 2006.

1. Introduction. In a discrete-time Markov decision process (MDP), the usual method to study the average cost criterion is to find a solution to the average cost optimality equations. A policy that achieves the minimum in this system of equations is then average cost optimal. When the state and action spaces are infinite, one may be required to replace the equations with inequalities, yet the conclusion is the same: a policy that achieves the minimum in the inequalities is average cost optimal. Schäl [27] provides two groups of general conditions that imply the existence of a solution to the average cost optimality inequalities (ACOI). The first group, referred to as Assumptions (W) in Schäl [27], requires weak continuity of the transition probabilities. The second group, Assumptions (S), requires setwise continuity of the transition probabilities. In either case, a compact action set was assumed for each state in Schäl [27]. The purpose of this paper is to adapt Schäl's [27] conditions to problems with noncompact action sets, in particular to those related to inventory control. As was
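The relationship between the discounted and average cost criteria described above can be illustrated on a toy example. The sketch below is not from the paper (which treats general Borel state and action spaces); it uses a small finite MDP, where the average cost optimality equation holds exactly, and shows the standard vanishing-discount heuristic: as the discount factor beta tends to 1, the normalized discounted value (1 - beta) * V_beta(s) approaches the optimal average cost per unit time, independently of the initial state s. All numbers (costs c, kernel P) are made up for illustration.

```python
import numpy as np

# Toy MDP: 3 states, 2 actions.  c[s, a] is the one-step cost of taking
# action a in state s; P[a, s, t] is the probability of moving s -> t
# under action a.  These numbers are arbitrary illustration data.
c = np.array([[1.0, 2.0],
              [0.5, 1.5],
              [2.0, 0.5]])
P = np.array([  # P[a, s, t]
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]],
    [[0.5, 0.3, 0.2], [0.3, 0.4, 0.3], [0.1, 0.1, 0.8]],
])

def discounted_value(beta, iters=20000):
    """Value iteration for the expected discounted cost criterion."""
    V = np.zeros(3)
    for _ in range(iters):
        # Bellman operator: Q[s, a] = c[s, a] + beta * sum_t P[a, s, t] V[t]
        Q = c + beta * np.einsum('ast,t->sa', P, V)
        V = Q.min(axis=1)
    return V

# Vanishing-discount heuristic: (1 - beta) * V_beta(s) flattens across
# states and approaches the optimal average cost as beta -> 1.
for beta in (0.9, 0.99, 0.999):
    V = discounted_value(beta)
    print(beta, (1 - beta) * V)
```

Running this, the vector (1 - beta) * V_beta becomes nearly constant across states as beta increases, which is exactly the kind of convergence of discounted objects to average cost objects that the paper establishes in a far more general setting.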