A note on the existence of optimal stationary policies for average Markov decision processes with countable states

Bibliographic Details
Published in: Automatica (Oxford) 2023-05, Vol. 151, p. 110877, Article 110877
Main Authors: Xia, Li; Guo, Xianping; Cao, Xi-Ren
Format: Article
Language: English
Description
Summary: In many practical stochastic dynamic optimization problems with countable states, the optimal policy possesses certain structural properties. Examples include the (s,S) policy in inventory control, and the well-known cμ-rule and the recently discovered c/μ-rule (Xia et al. (2022)) in the scheduling of queues. Such results presume that an optimal stationary policy exists. Many works study the existence of optimal stationary policies for Markov decision processes with countable state spaces (see, e.g., Bertsekas (2012); Hernández-Lerma and Lasserre (1996); Puterman (1994); Sennott (1999)). However, the conditions they give are usually not easy to verify in such optimization problems. In this paper, we study the optimization of the long-run average of continuous-time Markov decision processes with countable state spaces. We provide an intuitive approach to proving the existence of an optimal stationary policy. The approach is based simply on the compactness of the policy space, under a specially designed metric, and the continuity of the long-run average on that space. Our method is capable of handling cost functions unbounded from both above and below, which complements the existing literature, where the cost function is unbounded from only one side. Examples are provided to illustrate the application of our main results.
ISSN: 0005-1098, 1873-2836
DOI: 10.1016/j.automatica.2023.110877