A Modified Convergence DDPG Algorithm for Robotic Manipulation
Published in: Neural Processing Letters, 2023-12, Vol. 55 (8), pp. 11637–11652
Format: Article
Language: English
Summary: Today, robotic arms are widely used in industry, and reinforcement learning algorithms are frequently employed to control them in complex environments. One common off-policy, model-free actor-critic deep reinforcement learning algorithm for continuous action spaces is deep deterministic policy gradient (DDPG). DDPG has achieved significant results when applied to the control of robotic arms with high degrees of freedom, but it also has limitations: it is prone to instability and divergence in complex tasks because of the high-dimensional continuous action spaces. In this paper, to increase the reliability and convergence speed of the DDPG algorithm, a new modified convergence DDPG (MCDDPG) algorithm is presented. By saving and reusing desirable parameters of the previous actor and critic networks, the proposed algorithm shows a significant improvement in training time and model stability compared to conventional DDPG. We evaluate our method on the PR2's right arm, a 7-DoF manipulator, and simulations demonstrate that MCDDPG outperforms state-of-the-art algorithms such as DDPG and the normalized advantage function (NAF) in learning complex robotic tasks.
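The abstract only sketches the save-and-reuse mechanism, so the following is a minimal, hypothetical Python/PyTorch illustration of that idea: snapshotting the best-performing actor and critic weights seen so far and rolling back to them when training degrades. The class name `BestParamsCache`, the return-based trigger, and the `drop` threshold are assumptions for illustration, not the paper's exact procedure.

```python
import copy

class BestParamsCache:
    """Hypothetical sketch of MCDDPG's 'save and reuse desirable parameters'
    idea: cache the actor/critic weights that achieved the best evaluation
    return, and restore them if training later diverges. The trigger
    condition below is an assumption; the paper's exact criterion is not
    given in the abstract."""

    def __init__(self):
        self.best_return = float("-inf")
        self.actor_state = None
        self.critic_state = None

    def maybe_save(self, episode_return, actor, critic):
        # Snapshot parameters whenever the evaluation return improves.
        if episode_return > self.best_return:
            self.best_return = episode_return
            self.actor_state = copy.deepcopy(actor.state_dict())
            self.critic_state = copy.deepcopy(critic.state_dict())

    def maybe_restore(self, episode_return, actor, critic, drop=100.0):
        # Roll back to the cached weights when the return falls more than
        # `drop` below the best seen so far (a sign of divergence).
        if self.actor_state is not None and episode_return < self.best_return - drop:
            actor.load_state_dict(self.actor_state)
            critic.load_state_dict(self.critic_state)
            return True
        return False

# Usage inside a DDPG training loop (run_episode, env, and the networks
# are hypothetical placeholders):
#   cache = BestParamsCache()
#   for episode in range(num_episodes):
#       ret = run_episode(env, actor)
#       cache.maybe_save(ret, actor, critic)
#       cache.maybe_restore(ret, actor, critic)
```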
ISSN: 1370-4621, 1573-773X
DOI: 10.1007/s11063-023-11393-z