Smooth Q-Learning: An Algorithm for Independent Learners in Stochastic Cooperative Markov Games
Published in: | Journal of Intelligent & Robotic Systems, 2023-08, Vol. 108 (4), p. 65, Article 65 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | In this article, we introduce the Smooth Q-Learning algorithm for independent learners (distributed and non-communicative learners) in cooperative Markov games. Smooth Q-Learning aims to solve the relative over-generalization and stochasticity problems while also performing well in the presence of other non-coordination factors, such as the miscoordination problem (also known as the Pareto selection problem) and the non-stationarity problem. The algorithm seeks a trade-off between two incompatible learning approaches, maximum-based learning and average-based learning. It does so by dynamically adjusting the learning rate according to the value of the temporal difference error, so that its behavior lies somewhere between the two. We compare Smooth Q-Learning against several algorithms from the literature: Decentralized Q-Learning, Distributed Q-Learning, Hysteretic Q-Learning, and a recent version of Lenient Q-Learning called Lenient Multiagent Reinforcement Learning 2. The results show that Smooth Q-Learning is very effective, in the sense that it achieves the highest number of convergent trials. Unlike the competing algorithms, Smooth Q-Learning is also easy to tune and does not require storing additional information. |
---|---|
ISSN: | 0921-0296 1573-0409 |
DOI: | 10.1007/s10846-023-01917-z |
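
The summary describes a learning rate that depends on the temporal difference (TD) error and places the learner between average-based and maximum-based learning. This record does not give the actual update rule, so the following Python sketch is only an illustration of that general idea. The names `smooth_rate`, `temperature`, and the exponential damping of negative TD errors are assumptions made here, not the authors' formula. The two reference rules are simplified stand-ins: a plain Q-learning rate for average-based learning, and an optimistic rule that ignores negative TD errors for maximum-based learning.

```python
import math


def td_error(q, reward, next_max_q, gamma=0.9):
    """Temporal difference error for one (state, action) entry."""
    return reward + gamma * next_max_q - q


def average_based_rate(delta, alpha=0.1):
    """Average-based learning: the same rate whatever the sign of delta."""
    return alpha


def maximum_based_rate(delta, alpha=0.1):
    """Optimistic learning: negative TD errors are ignored entirely."""
    return alpha if delta > 0 else 0.0


def smooth_rate(delta, alpha=0.1, temperature=1.0):
    """HYPOTHETICAL interpolation, not the rule from the paper.

    Positive errors use the full rate. Negative errors are damped
    exponentially: a large temperature approaches average-based
    learning, a small one approaches the optimistic rule.
    """
    if delta >= 0:
        return alpha
    return alpha * math.exp(delta / temperature)


def update(q, reward, next_max_q, rate_fn, gamma=0.9):
    """One tabular Q-value update using the given learning-rate rule."""
    delta = td_error(q, reward, next_max_q, gamma)
    return q + rate_fn(delta) * delta


if __name__ == "__main__":
    # A negative TD error (q = 1.0, reward = 0.0) separates the three rules.
    for rule in (average_based_rate, smooth_rate, maximum_based_rate):
        print(rule.__name__, update(1.0, 0.0, 0.0, rule))
```

With a negative TD error, the hypothetical smooth rule lowers the Q-value less than the average-based rule and more than the optimistic rule. With a positive TD error, all three rules in this sketch coincide.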