Markov control models with unknown random state–action-dependent discount factors
Published in: | TOP 2015-10, Vol. 23 (3), p. 743-772 |
---|---|
Format: | Article |
Language: | English |
Summary: | The paper deals with a class of discounted discrete-time Markov control models with non-constant discount factors of the form $\tilde{\alpha}(x_n, a_n, \xi_{n+1})$, where $x_n$, $a_n$, and $\xi_{n+1}$ are the state, the action, and a random disturbance at time $n$, respectively, taking values in Borel spaces. Assuming that the one-stage cost is possibly unbounded and that the distributions of $\xi_n$ are unknown, we study the corresponding optimal control problem under two settings. In the first, we assume that the random disturbance process $\{\xi_n\}$ consists of observable independent and identically distributed random variables, and we introduce an estimation and control procedure to construct strategies. In the second, $\xi_n$ is assumed to be non-observable and its distribution may change from stage to stage; in this case the problem is studied as a minimax control problem in which the controller has an opponent selecting the distribution of the corresponding random disturbance at each stage. |
ISSN: | 1134-5764; 1863-8279 |
DOI: | 10.1007/s11750-015-0360-5 |
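
For orientation, the display below sketches the kind of performance criterion such models typically optimize. The record itself does not reproduce the paper's optimality index, so this is an assumption based on the abstract's notation: the discount factor $\tilde{\alpha}(x_n, a_n, \xi_{n+1})$ is taken from the summary, while the one-stage cost $c$, the policy $\pi$, and the initial state $x$ are symbols introduced here purely for illustration.

% Hedged sketch: total expected discounted cost with a random
% state-action-dependent discount factor, accumulated multiplicatively
% along the trajectory; c, \pi, and x are illustrative notation, not
% taken from the record.
\[
  V(\pi, x) \;=\;
  \mathbb{E}_x^{\pi}\!\left[\, c(x_0, a_0)
    \;+\; \sum_{n=1}^{\infty}
      \left( \prod_{k=1}^{n} \tilde{\alpha}(x_{k-1}, a_{k-1}, \xi_{k}) \right)
      c(x_n, a_n) \right].
\]

Under this reading, the controller seeks a policy $\pi$ minimizing $V(\pi, x)$; in the second (minimax) setting described in the summary, the criterion would instead be evaluated against the least favorable choice of disturbance distributions made by the opponent at each stage.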