Abstract

Model-based reinforcement learning (MBRL) approaches rely on discrete-time state transition models, whereas physical systems and the vast majority of control tasks operate in continuous time. To avoid time-discretization approximations of the underlying process, we propose a continuous-time MBRL framework based on a novel actor-critic method. Our approach also infers the unknown state evolution differentials with Bayesian neural ordinary differential equations (ODEs) to account for epistemic uncertainty. We implement and test our method on a new ODE-RL suite that explicitly solves continuous-time control systems. Our experiments illustrate that the model is robust against irregular and noisy data, and can solve classic control problems in a sample-efficient manner.
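
As a loose illustration of the core idea (a minimal sketch, not the authors' implementation), the dynamics model below learns the state differential ds/dt with a neural network and integrates it with an ODE solver, rather than fitting a discrete-time transition model. It assumes PyTorch with the torchdiffeq package; the zero-order hold on actions and all names are illustrative, and the paper's Bayesian treatment of epistemic uncertainty is omitted for brevity.

import torch
import torch.nn as nn
from torchdiffeq import odeint  # black-box ODE solvers for PyTorch


class ODEDynamics(nn.Module):
    """Illustrative model of the unknown differential f(s, a) = ds/dt."""

    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, state_dim),
        )
        self.action = None  # action held fixed during one integration interval

    def forward(self, t, s):
        # torchdiffeq calls forward(t, state); return the time derivative ds/dt
        return self.net(torch.cat([s, self.action], dim=-1))


def rollout(dynamics, s0, actions, dt=0.1):
    """Integrate the learned ODE through a sequence of actions (zero-order hold)."""
    states, s = [s0], s0
    for a in actions:
        dynamics.action = a
        t = torch.tensor([0.0, dt])
        s = odeint(dynamics, s, t)[-1]  # state after dt seconds of continuous flow
        states.append(s)
    return torch.stack(states)

Because the solver integrates the learned vector field over arbitrary time spans, such a model can in principle be trained on irregularly sampled transitions, which is one motivation for the continuous-time formulation described in the abstract.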
Original language: English
Title: Proceedings of the 38th International Conference on Machine Learning, ICML 2021
Publisher: JMLR
Pages: 12009-12018
Status: Published - 21 Jul 2021
OKM publication type: A4 Article in a conference publication
Event: International Conference on Machine Learning - Virtual, Online
Duration: 18 Jul 2021 - 24 Jul 2021
Conference number: 38

Publication series

Name: Proceedings of Machine Learning Research
Publisher: PMLR
Volume: 139
ISSN (electronic): 2640-3498

Conference

Conference: International Conference on Machine Learning
Abbreviated title: ICML
City: Virtual, Online
Period: 18/07/2021 - 24/07/2021
