Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes

Bibliographic Details
Main Authors: Klaška, David, Kučera, Antonín, Kůr, Vojtěch, Musil, Vít, Řehák, Vojtěch
Format: Conference Proceeding
Language: English
Description
Summary: Long-run average optimization problems for Markov decision processes (MDPs) require constructing policies with optimal steady-state behavior, i.e., optimal limit frequency of visits to the states. However, such policies may suffer from local instability in the sense that the frequency of states visited in a bounded time horizon along a run differs significantly from the limit frequency. In this work, we propose an efficient algorithmic solution to this problem.
ISSN: 2159-5399, 2374-3468
DOI: 10.1609/aaai.v38i18.29993
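
Illustration: the local instability described in the summary can be seen on a toy example. The sketch below is not the algorithm proposed in the paper; it only compares two hand-made runs over states A and B that share the same limit frequency (0.5 for each state) but behave very differently inside a bounded window. The function name `window_deviation`, the two runs, and the window length of 10 are all assumptions chosen for illustration.

```python
def window_deviation(run, state, limit_freq, window):
    """Largest gap between the frequency of `state` in any length-`window`
    slice of `run` and the long-run frequency `limit_freq`."""
    hits = [1 if s == state else 0 for s in run]
    count = sum(hits[:window])                 # visits in the first window
    worst = abs(count / window - limit_freq)
    for i in range(window, len(hits)):
        count += hits[i] - hits[i - window]    # slide the window by one step
        worst = max(worst, abs(count / window - limit_freq))
    return worst


# Two hypothetical runs, both visiting A exactly half of the time overall.
alternating = ["A", "B"] * 500                 # A B A B ...
blocky = (["A"] * 50 + ["B"] * 50) * 10        # 50 A's, then 50 B's, repeated

dev_alternating = window_deviation(alternating, "A", 0.5, 10)
dev_blocky = window_deviation(blocky, "A", 0.5, 10)

print(dev_alternating, dev_blocky)
```

Every window of the alternating run contains A exactly half of the time, so its deviation is zero, while the blocky run contains windows made up entirely of A or entirely of B, so its windowed frequency is as far from the limit frequency as possible.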