Stochastic Integrated Actor-Critic for Deep Reinforcement Learning
Published in: | IEEE Transactions on Neural Networks and Learning Systems, 2024-05, Vol. 35 (5), pp. 6654-6666 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | We propose a deep stochastic actor-critic algorithm with an integrated network architecture and fewer parameters. We address stabilization of the learning procedure via an adaptive objective to the critic's loss and a smaller learning rate for the shared parameters between the actor and the critic. Moreover, we propose a mixed on-off policy exploration strategy to speed up learning. Experiments illustrate that our algorithm reduces the sample complexity by 50%-93% compared with the state-of-the-art deep reinforcement learning (RL) algorithms twin delayed deep deterministic policy gradient (TD3), soft actor-critic (SAC), proximal policy optimization (PPO), advantage actor-critic (A2C), and interpolated policy gradient (IPG) over continuous control tasks LunarLander, BipedalWalker, BipedalWalkerHardCore, Ant, and Minitaur in the OpenAI Gym. |
ISSN: | 2162-237X; 2162-2388 |
DOI: | 10.1109/TNNLS.2022.3212273 |
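The summary names two stabilization devices: an adaptive objective added to the critic's loss and a smaller learning rate for the parameters shared between actor and critic. The second idea can be sketched as follows; this is a minimal illustration in plain NumPy, with hypothetical layer sizes, learning rates, and gradients, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Integrated architecture: one shared trunk feeds both the actor head and
# the critic head (sizes here are illustrative, not taken from the paper).
W_shared = rng.normal(size=(8, 16))   # shared trunk, used by actor AND critic
W_actor = rng.normal(size=(16, 4))    # actor head (policy parameters)
W_critic = rng.normal(size=(16, 1))   # critic head (value estimate)

lr_head = 1e-3
lr_shared = 1e-4  # smaller step for shared parameters, as the summary describes

def sgd_step(W, grad, lr):
    """One plain SGD update; the paper's actual optimizer may differ."""
    return W - lr * grad

# Dummy gradients standing in for backprop through the actor and critic losses.
g_shared = rng.normal(size=W_shared.shape)
g_actor = rng.normal(size=W_actor.shape)
g_critic = rng.normal(size=W_critic.shape)

W_shared_new = sgd_step(W_shared, g_shared, lr_shared)  # cautious shared update
W_actor_new = sgd_step(W_actor, g_actor, lr_head)
W_critic_new = sgd_step(W_critic, g_critic, lr_head)
```

Because the shared trunk receives gradients from both the actor's and the critic's objectives, stepping it more slowly damps interference between the two learning signals.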