
Adaptive deep reinforcement learning for non-stationary environments

Bibliographic Details
Published in: Science China Information Sciences, 2022-10, Vol. 65 (10), p. 202204, Article 202204
Main Authors: Zhu, Jin; Wei, Yutong; Kang, Yu; Jiang, Xiaofeng; Dullerud, Geir E.
Format: Article
Language:English
Description
Summary: Deep reinforcement learning (DRL) is currently used to solve Markov decision process problems for which the environment is typically assumed to be stationary. In this paper, we propose an adaptive DRL method for non-stationary environments. First, we introduce model uncertainty and propose the self-adjusting deep Q-learning algorithm, which automatically rebalances exploration and exploitation as the environment changes. Second, we propose a feasible criterion for judging the appropriateness of the parameter setting of deep Q-networks, and we minimize the misjudgment probability based on the large deviation principle (LDP). The effectiveness of the proposed adaptive DRL method is illustrated on an advanced persistent threat (APT) attack simulation game. Experimental results show that, compared with classic deep Q-learning algorithms in non-stationary and stationary environments, the adaptive DRL method improves performance by at least 14.28% and 30.56%, respectively.
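The abstract's central idea is raising exploration again when the environment shifts rather than letting it decay monotonically. The sketch below is a minimal illustration of that idea only, not the authors' self-adjusting deep Q-learning algorithm: it substitutes a tabular Q-table for a deep Q-network and uses a hypothetical TD-error drift heuristic, where a sustained rise in temporal-difference error is read as a sign of environment change and the exploration rate is reset. The toy environment, window_size, drift_factor, and epsilon schedule are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only: a tabular Q-learning stand-in that resets its
# exploration rate when a drift in TD error suggests the environment changed.
# Not the paper's algorithm; all thresholds are hypothetical.

rng = np.random.default_rng(0)

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.95
eps, eps_min, eps_max = 0.1, 0.05, 1.0

td_window = []        # recent absolute TD errors
window_size = 50      # hypothetical drift-detection window
drift_factor = 3.0    # hypothetical threshold multiplier
baseline = None       # running baseline of mean |TD error|

def step(state, action, shifted):
    """Toy environment whose reward rule flips when `shifted` is True."""
    reward = 1.0 if (action == state % 2) != shifted else 0.0
    return int(rng.integers(n_states)), reward

state = 0
for t in range(5000):
    shifted = t >= 2500                      # environment changes mid-run
    action = (int(rng.integers(n_actions)) if rng.random() < eps
              else int(np.argmax(Q[state])))
    next_state, reward = step(state, action, shifted)

    # Standard Q-learning update; |TD error| doubles as a change signal.
    td = reward + gamma * Q[next_state].max() - Q[state, action]
    Q[state, action] += alpha * td

    td_window.append(abs(td))
    if len(td_window) > window_size:
        td_window.pop(0)
        mean_td = np.mean(td_window)
        if baseline is None:
            baseline = mean_td
        elif mean_td > drift_factor * baseline:
            eps = eps_max                    # change suspected: re-explore
        else:
            baseline = 0.99 * baseline + 0.01 * mean_td
            eps = max(eps_min, eps * 0.995)  # otherwise decay exploration

    state = next_state
```

The paper itself goes further: it works with deep Q-networks under model uncertainty and uses the large deviation principle to bound the misjudgment probability of its parameter-setting criterion, none of which this toy drift heuristic reproduces.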
ISSN: 1674-733X
1869-1919
DOI: 10.1007/s11432-021-3347-8