
Stochastic Integrated Actor-Critic for Deep Reinforcement Learning

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, 2024-05, Vol. 35 (5), pp. 6654-6666
Main Authors: Zheng, Jiaohao, Kurt, Mehmet Necip, Wang, Xiaodong
Format: Article
Language:English
Description
Summary: We propose a deep stochastic actor-critic algorithm with an integrated network architecture and fewer parameters. We address stabilization of the learning procedure via an adaptive objective for the critic's loss and a smaller learning rate for the parameters shared between the actor and the critic. Moreover, we propose a mixed on-off policy exploration strategy to speed up learning. Experiments illustrate that our algorithm reduces the sample complexity by 50%-93% compared with the state-of-the-art deep reinforcement learning (RL) algorithms twin delayed deep deterministic policy gradient (TD3), soft actor-critic (SAC), proximal policy optimization (PPO), advantage actor-critic (A2C), and interpolated policy gradient (IPG) on the continuous control tasks LunarLander, BipedalWalker, BipedalWalkerHardCore, Ant, and Minitaur in the OpenAI Gym.
ISSN:2162-237X
2162-2388
DOI:10.1109/TNNLS.2022.3212273
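
The abstract's stabilization idea — an integrated (shared-trunk) actor-critic network whose shared parameters are updated with a smaller learning rate than the actor- and critic-specific heads — can be illustrated with a minimal sketch. This is not the authors' code; the class name, the toy scalar "parameters", and the specific learning rates are illustrative assumptions, since the record does not give the paper's architecture or hyperparameters.

```python
# Minimal sketch (not the authors' implementation): an integrated
# actor-critic where the actor and critic share trunk parameters, and
# the shared block uses a smaller learning rate, as the abstract describes.

class IntegratedActorCritic:
    def __init__(self, shared_lr=1e-4, head_lr=1e-3):
        # Toy scalar "parameters" standing in for network weight tensors.
        self.shared = 0.0           # trunk shared by actor and critic
        self.actor_head = 0.0       # actor-only parameters
        self.critic_head = 0.0      # critic-only parameters
        self.shared_lr = shared_lr  # smaller LR: the shared trunk receives
        self.head_lr = head_lr      #   gradients from both losses

    def update(self, grad_shared, grad_actor, grad_critic):
        # Plain gradient step: the shared trunk moves more slowly
        # than either head, which is the claimed stabilization mechanism.
        self.shared -= self.shared_lr * grad_shared
        self.actor_head -= self.head_lr * grad_actor
        self.critic_head -= self.head_lr * grad_critic

ac = IntegratedActorCritic()
ac.update(grad_shared=1.0, grad_actor=1.0, grad_critic=1.0)
print(ac.shared, ac.actor_head)  # shared parameter moved 10x less than the head
```

The point of the sketch is only the ratio between the two learning rates: because the shared trunk is driven by gradients from both the actor and the critic objectives, damping its step size limits interference between the two updates.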