
A reinforcement learning approach to fail-safe design for multiple space robots-cooperation mechanism without communication and negotiation schemes

Bibliographic Details
Published in: Advanced Robotics, 2003-01, Vol. 17 (1), p. 21-39
Main Authors: Takadama, Keiki, Matsumoto, Shuichi, Nakasuka, Shinichi, Shimohara, Katsunori
Format: Article
Language:English
Description
Summary: This paper explores a fail-safe design for multiple space robots, which enables robots to complete given tasks even when they can no longer be controlled due to a communication accident or negotiation problem. As the first step towards this goal, we propose new reinforcement learning methods that help robots avoid deadlock situations in addition to improving the degree of task completion without communications via ground stations or negotiations with other robots. Through intensive simulations on a truss construction task, we found that our reinforcement learning methods have great potential to contribute towards fail-safe design for multiple space robots in the above case. Furthermore, the simulations revealed the following detailed implications: (i) the first several planned behaviors must not be reinforced with negative rewards even in deadlock situations in order to derive cooperation among multiple robots, (ii) a certain amount of positive rewards added into negative rewards in deadlock situations contributes to reducing the computational cost of finding behavior plans for task completion, and (iii) an appropriate balance between positive and negative rewards in deadlock situations is indispensable for finding good behavior plans at a small computational cost.
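The reward-design implications (i)-(iii) in the summary can be sketched in code. The following is a minimal, hypothetical illustration only, not the authors' implementation: all function names, constants (`protected_steps`, `penalty`, `positive_offset`), and the tabular Q-learning update are our assumptions, chosen to show how early planned behaviors could be shielded from negative reinforcement and how a positive offset could be mixed into the deadlock penalty.

```python
# Hypothetical sketch of the reward-balancing idea described in the abstract.
# Parameter names and values are illustrative assumptions, not the paper's.

def deadlock_reward(step_index, protected_steps=3,
                    penalty=-1.0, positive_offset=0.4):
    """Reward applied when a deadlock is detected at plan step `step_index`.

    Implication (i): the first few planned behaviors are never reinforced
    with negative rewards, so cooperation among robots can still emerge.
    Implications (ii)/(iii): later deadlocks receive a negative reward
    softened by a positive offset; balancing the two is the key knob.
    """
    if step_index < protected_steps:
        return 0.0  # do not penalize early planned behaviors
    return penalty + positive_offset  # balanced negative reward


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update over a two-action space."""
    best_next = max(q.get((next_state, a), 0.0) for a in (0, 1))
    q[(state, action)] = q.get((state, action), 0.0) + alpha * (
        reward + gamma * best_next - q.get((state, action), 0.0))


# Example: a deadlock detected at step 5 yields a softened penalty of -0.6,
# which is then folded into the Q-value for the offending state-action pair.
q = {}
q_update(q, state=0, action=1, reward=deadlock_reward(5), next_state=1)
```

In this sketch, the ratio of `positive_offset` to `penalty` plays the role of the positive/negative reward balance that the paper reports as indispensable for finding good behavior plans at low computational cost.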
ISSN: 0169-1864
1568-5535
DOI: 10.1163/156855303321125604