Loading…
GoSafeOpt: Scalable safe exploration for global optimization of dynamical systems
Learning optimal control policies directly on physical systems is challenging. Even a single failure can lead to costly hardware damage. Most existing model-free learning methods that guarantee safety, i.e., no failures, during exploration are limited to local optima. This work proposes GoSafeOpt as...
Saved in:
Published in: | Artificial intelligence 2023-07, Vol.320, p.103922, Article 103922 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Learning optimal control policies directly on physical systems is challenging. Even a single failure can lead to costly hardware damage. Most existing model-free learning methods that guarantee safety, i.e., no failures, during exploration are limited to local optima. This work proposes GoSafeOpt as the first provably safe and optimal algorithm that can safely discover globally optimal policies for systems with high-dimensional state space. We demonstrate the superiority of GoSafeOpt over competing model-free safe learning methods in simulation and hardware experiments on a robot arm. |
---|---|
ISSN: | 0004-3702 1872-7921 1872-7921 |
DOI: | 10.1016/j.artint.2023.103922 |