A Robust Approach for Continuous Interactive Actor-Critic Algorithms

Bibliographic Details
Published in:	IEEE Access, 2021, Vol. 9, pp. 104242-104260
Main Authors:	Millan-Arias, Cristian C.; Fernandes, Bruno J. T.; Cruz, Francisco; Dazeley, Richard; Fernandes, Sergio
Format:	Article
Language:	English
Subjects:	Algorithms; Approximation algorithms; Continuous interactive reinforcement learning; Interactive robust reinforcement learning; Machine learning; Manipulators; Organisms; Proposals; Reinforcement learning; Robot arms; Robust reinforcement learning; Robustness; Task analysis; Trajectory optimization

Description
Summary:	Reinforcement learning refers to a machine learning paradigm in which an agent interacts with the environment to learn how to perform a task. The characteristics of the environment may change over time or be affected by uncontrolled disturbances, preventing the agent from finding a proper policy. Some approaches attempt to address these problems, such as interactive reinforcement learning, where an external entity helps the agent learn through advice. Other approaches, such as robust reinforcement learning, allow the agent to learn the task while acting in a disturbed environment. In this paper, we propose an approach that addresses interactive reinforcement learning problems in a dynamic environment, where advice provides information on both the task and the dynamics of the environment. Thus, an agent learns a policy in a disturbed environment while receiving advice. We implement our approach in the dynamic version of the cart-pole balancing task and in a simulated dynamic environment in which a robotic arm organizes objects. Our results show that the proposed approach allows an agent to complete the task satisfactorily in a dynamic, continuous state-action domain. Moreover, experimental results suggest that agents trained with our approach are less sensitive to changes in the characteristics of the environment than interactive reinforcement learning agents.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2021.3099071