TY - JOUR
T1 - GoSafeOpt: Scalable Safe Exploration for Global Optimization of Dynamical Systems
AU - Sukhija, Bhavya
AU - Turchetta, Matteo
AU - Lindner, David
AU - Krause, Andreas
AU - Trimpe, Sebastian
AU - Baumann, Dominik
PY - 2023/7
Y1 - 2023/7
N2 - Learning optimal control policies directly on physical systems is challenging. Even a single failure can lead to costly hardware damage. Most existing model-free learning methods that guarantee safety, i.e., no failures, during exploration are limited to local optima. This work proposes GoSafeOpt as the first provably safe and optimal algorithm that can safely discover globally optimal policies for systems with high-dimensional state spaces. We demonstrate the superiority of GoSafeOpt over competing model-free safe learning methods in simulation and hardware experiments on a robot arm.
AB - Learning optimal control policies directly on physical systems is challenging. Even a single failure can lead to costly hardware damage. Most existing model-free learning methods that guarantee safety, i.e., no failures, during exploration are limited to local optima. This work proposes GoSafeOpt as the first provably safe and optimal algorithm that can safely discover globally optimal policies for systems with high-dimensional state spaces. We demonstrate the superiority of GoSafeOpt over competing model-free safe learning methods in simulation and hardware experiments on a robot arm.
KW - Computer Science - Machine Learning
KW - Electrical Engineering and Systems Science - Systems and Control
UR - http://adsabs.harvard.edu/abs/2022arXiv220109562S
UR - http://www.scopus.com/inward/record.url?scp=85153578129&partnerID=8YFLogxK
U2 - 10.1016/j.artint.2023.103922
DO - 10.1016/j.artint.2023.103922
M3 - Article
SN - 0004-3702
VL - 320
JO - Artificial Intelligence
JF - Artificial Intelligence
M1 - 103922
ER -