GoSafeOpt: Scalable safe exploration for global optimization of dynamical systems

Bibliographic Details
Published in: Artificial Intelligence, 2023-07, Vol. 320, Article 103922
Main Authors: Sukhija, Bhavya; Turchetta, Matteo; Lindner, David; Krause, Andreas; Trimpe, Sebastian; Baumann, Dominik
Format: Article
Language: English
Description
Summary: Learning optimal control policies directly on physical systems is challenging. Even a single failure can lead to costly hardware damage. Most existing model-free learning methods that guarantee safety, i.e., no failures, during exploration are limited to local optima. This work proposes GoSafeOpt as the first provably safe and optimal algorithm that can safely discover globally optimal policies for systems with high-dimensional state space. We demonstrate the superiority of GoSafeOpt over competing model-free safe learning methods in simulation and hardware experiments on a robot arm.
ISSN: 0004-3702, 1872-7921
DOI: 10.1016/j.artint.2023.103922