Optimality Inequalities for Average Cost MDPs
and their Inventory Control Applications
Eugene A. Feinberg
Mark E. Lewis
Abstract
We show that the assumptions guaranteeing existence of a solution
to the average cost optimality inequalities presented in Schäl [1]
for compact action sets can be extended to include the noncompact
action set case. Inventory and stochastic cash balance models (with
fixed costs) are natural candidates for the application of our
results. Extensions of the classic models to a general demand
distribution are discussed in detail.
I. INTRODUCTION
In a discrete-time Markov decision process the usual method to study
the average cost criterion is to find a solution to the average cost
optimality equations. A policy that achieves the minimum in this
system of equations is then average cost optimal. When the state and
action spaces are infinite, one may be required to replace the
equations with inequalities, yet the conclusion is the same: a policy
that achieves the minimum in the inequalities is average cost
optimal. Schäl [1] provides two sets of general conditions that imply
the existence of a solution to the average cost optimality
inequalities (ACOI). The first, referred to as Assumptions (W) in
Schäl [1], requires weak continuity of the transition probabilities.
The second group, Assumptions (S), requires setwise continuity of the
transition probabilities. The purpose of this paper is to relax the
assumptions in Schäl [1] so that the results can be applied directly
to several problems in the literature, in particular to those related
to inventory control.
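For orientation, the optimality equations and inequalities discussed above are commonly written in the following form. The notation is an assumption made for this sketch and is not taken from this section: $c(x,a)$ denotes a one-step cost, $A(x)$ the set of actions available in state $x$, $q(dy \mid x, a)$ the transition probability, $w$ a constant, and $u$ a function on the state space.

```latex
% Average cost optimality equation (assumed notation: c, A(x), w, u)
w + u(x) = \min_{a \in A(x)} \Big\{ c(x,a) + \int u(y)\, q(dy \mid x, a) \Big\}

% Average cost optimality inequality (ACOI): "=" is relaxed to ">="
w + u(x) \ge \min_{a \in A(x)} \Big\{ c(x,a) + \int u(y)\, q(dy \mid x, a) \Big\}
```

In the standard theory, and under suitable conditions, a stationary policy that attains the minimum on the right-hand side is average cost optimal, with $w$ playing the role of the optimal average cost.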
Recall the typical dynamic state equation for inventory control models
$$x_{n+1} = x_n + a_n - D_{n+1}, \qquad n = 0, 1, 2, \ldots, \tag{I.1}$$
where $x_n$ is the inventory at the end of period $n$, $a_n$ is the
decision of how much to order, and $D_n$ is the demand during period
$n$. Let
$q(dy \mid x, a)$ be the transition probability for the control
problem (I.1). Weak continuity of $q$ means that
$$E_{x_n^k, a_n^k} f(x_{n+1}) \to E_{x_n, a_n} f(x_{n+1})$$
for any sequence $\{(x_n^k, a_n^k),\ k \ge 0\}$ such that
$(x_n^k, a_n^k) \to (x_n, a_n)$, where $f$ is any bounded, continuous
function and the expectation is taken with respect to $q$. This holds
in most inventory applications in light of (I.1) and Lebesgue's
dominated convergence theorem.

Feinberg: Department of Applied Mathematics and Statistics, State
University of New York at Stony Brook, Stony Brook, NY 11794-3600
(efeinberg@notes.cc.sunysb.edu)
Lewis: Department of Industrial and Operations Engineering,
University of Michigan, 1205 Beal Avenue, Ann Arbor, MI 48109-2117
(melewis@engin.umich.edu)

On the other hand, setwise continuity is too strong. Recall that this
means that
$$q(B \mid x_n^k, a_n^k) \to q(B \mid x_n, a_n)$$
as $(x_n^k, a_n^k) \to (x_n, a_n)$ for any Borel set $B$. For example,
let $D_n = 1$ (deterministically), $a_n^k = a_n + \frac{1}{k}$, and
$x_n^k = x_n$. Then $q(B \mid x_n, a_n) = 1$ for
$B = (-\infty, x_n + a_n - 1]$ and $q(B \mid x_n, a_n^k) = 0$ for all
$k = 1, 2, \ldots$
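The example with deterministic demand can be checked numerically. The following is a minimal sketch, not part of the original paper: the values $x_n = 2$, $a_n = 3$, the test function $f(y) = 1/(1+y^2)$, and all function names are hypothetical choices made for illustration. Exact rational arithmetic is used so that the comparison with the endpoint of $B$ is not affected by rounding.

```python
from fractions import Fraction


def next_state(x, a, d=1):
    """One step of (I.1), x_{n+1} = x_n + a_n - D_{n+1}, with deterministic demand d."""
    return x + a - d


def q_of_B(x, a, upper):
    """q(B | x, a) for B = (-inf, upper].

    With deterministic demand the next state is a point mass,
    so the probability is either 0 or 1.
    """
    return 1 if next_state(x, a) <= upper else 0


def f(y):
    """A bounded, continuous test function (hypothetical choice)."""
    return 1 / (1 + y * y)


# Hypothetical limit point (x_n, a_n) and the set B = (-inf, x_n + a_n - 1].
x, a = Fraction(2), Fraction(3)
upper = x + a - 1

# Setwise continuity fails: q(B | x_n, a_n) = 1, but q(B | x_n, a_n + 1/k) = 0.
limit_prob = q_of_B(x, a, upper)
perturbed_probs = [q_of_B(x, a + Fraction(1, k), upper) for k in range(1, 6)]

# Weak continuity holds for this f: the gap in expectations shrinks as k grows.
gaps = [
    abs(f(next_state(x, a + Fraction(1, k))) - f(next_state(x, a)))
    for k in (1, 10, 100, 1000)
]

print("q(B | x, a) =", limit_prob)
print("q(B | x, a + 1/k), k = 1..5:", perturbed_probs)
print("|E f - E f| for k = 1, 10, 100, 1000:", [float(g) for g in gaps])
```

The probabilities of $B$ stay at 0 along the sequence while the limit is 1, whereas the expectations of the continuous function $f$ converge. This illustrates why weak continuity is the more natural assumption for models of the form (I.1).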