
Timesharing-tracking Framework for Decentralized Reinforcement Learning in Fully Cooperative Multi-agent System

Bibliographic Details
Published in: IEEE/CAA Journal of Automatica Sinica, 2014-04, Vol. 1 (2), p. 127-133
Main Authors: Chen, Xin; Fu, Bo; He, Yong; Wu, Min
Format: Article
Language:English
Description
Summary: Dimension-reduced and decentralized learning is always viewed as an efficient way to solve multi-agent cooperative learning in high dimension. However, the dynamic environment brought about by concurrent learning makes decentralized learning hard to converge and poor in performance. To tackle this problem, a timesharing-tracking framework (TTF), stemming from the idea that alternative learning in the microscopic view results in concurrent learning in the macroscopic view, is proposed in this paper, in which joint-state best-response Q-learning (BRQ-learning) serves as the primary algorithm to adapt to the companions' policies. With a properly defined switching principle, TTF makes all agents learn the best responses to others at different joint states. Thus, from the view of the whole joint-state space, the agents learn the optimal cooperative policy simultaneously. The simulation results illustrate that the proposed algorithm can learn the optimal joint behavior with less computation and faster speed compared with two other classical learning algorithms.
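The timesharing idea in the abstract (only one agent updates at a time while its companions hold their policies fixed) can be sketched in a minimal form. The example below is a hypothetical illustration, not the paper's TTF algorithm: the joint-state setting is collapsed to a single-state cooperative matrix game, and the payoff matrix, step size, and switching rule (strict alternation per episode) are all assumptions made for the demo.

```python
import random

random.seed(0)  # reproducible demo

# Hypothetical shared payoff for a 2-agent cooperative matrix game;
# entry REWARD[a0][a1] is the common reward for joint action (a0, a1).
# The unique Nash (and optimal) joint action here is (1, 1).
REWARD = [[1.0, 2.0],
          [3.0, 10.0]]

ALPHA, EPS, EPISODES = 0.1, 0.2, 5000
Q = [[0.0, 0.0], [0.0, 0.0]]  # one Q-table per agent over its own actions

def greedy(q):
    """Index of the highest-valued action (ties go to the lower index)."""
    return max(range(len(q)), key=lambda a: q[a])

for t in range(EPISODES):
    learner = t % 2              # timesharing: only one agent learns per episode
    companion = 1 - learner
    a = [0, 0]
    a[companion] = greedy(Q[companion])       # companion holds its policy fixed
    a[learner] = (random.randrange(2) if random.random() < EPS
                  else greedy(Q[learner]))    # epsilon-greedy exploration
    r = REWARD[a[0]][a[1]]
    # best-response Q update: track the return against the companion's policy
    Q[learner][a[learner]] += ALPHA * (r - Q[learner][a[learner]])

print(greedy(Q[0]), greedy(Q[1]))
```

Because each agent faces a stationary companion while it learns, its update is an ordinary single-agent Q-learning step; the alternation is what makes both agents appear to learn concurrently over the whole run, which is the intuition the abstract describes.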
ISSN: 2329-9266
2329-9274
DOI: 10.1109/JAS.2014.7004541