A note on the existence of optimal stationary policies for average Markov decision processes with countable states
Published in: | Automatica (Oxford) 2023-05, Vol.151, p.110877, Article 110877 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | In many practical stochastic dynamic optimization problems with countable states, the optimal policy possesses certain structural properties. Examples include the (s,S) policy in inventory control, and the well-known cμ-rule and the recently discovered c/μ-rule (Xia et al. (2022)) in the scheduling of queues. Such results presume that an optimal stationary policy exists. There is a large body of research on the existence of optimal stationary policies for Markov decision processes with countable state spaces (see, e.g., Bertsekas (2012); Hernández-Lerma and Lasserre (1996); Puterman (1994); Sennott (1999)). However, the existence conditions given there are usually not easy to verify in such optimization problems. In this paper, we study the optimization of the long-run average of continuous-time Markov decision processes with countable state spaces. We provide an intuitive approach to proving the existence of an optimal stationary policy. The approach is based simply on the compactness of the policy space, under a specially designed metric, and the continuity of the long-run average on that space. Our method can handle cost functions that are unbounded from both above and below, which complements the existing literature, where the cost function is unbounded from only one side. Examples are provided to illustrate the application of our main results. |
---|---|
ISSN: | 0005-1098 1873-2836 |
DOI: | 10.1016/j.automatica.2023.110877 |