Simplified Temporal Consistency Reinforcement Learning

Research output: Article in book/conference publication › Conference article in proceedings › Scientific › peer-reviewed

Abstract

Reinforcement learning (RL) is able to solve complex sequential decision-making tasks but is currently limited by sample efficiency and required computation. To improve sample efficiency, recent work focuses on model-based RL, which interleaves model learning with planning. Recent methods further utilize policy learning, value estimation, and self-supervised learning as auxiliary objectives. In this paper, we show that, surprisingly, a simple representation learning approach relying only on a latent dynamics model trained by latent temporal consistency is sufficient for high-performance RL. This applies when using pure planning with a dynamics model conditioned on the representation, but also when utilizing the representation as policy and value function features in model-free RL. In experiments, our approach learns an accurate dynamics model to solve challenging high-dimensional locomotion tasks with online planners while being 4.1× faster to train compared to ensemble-based methods. With model-free RL without planning, especially on high-dimensional tasks such as the DeepMind Control Suite Humanoid and Dog tasks, our approach outperforms model-free methods by a large margin and matches model-based methods’ sample efficiency while training 2.4× faster.
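
As an illustration of what "latent temporal consistency" can mean in practice, the following is a minimal sketch and not the authors' implementation. It assumes linear maps in place of neural networks, a cosine-similarity objective, and an exponential-moving-average (EMA) target encoder; the names `W_enc`, `W_tgt`, `W_dyn`, `cosine_consistency_loss`, and `ema_update`, and all dimensions, are hypothetical. The abstract itself only states that a latent dynamics model is trained by latent temporal consistency.

```python
import numpy as np


def cosine_consistency_loss(pred, target):
    """Mean (1 - cosine similarity) between predicted and target latents."""
    pred_n = pred / (np.linalg.norm(pred, axis=-1, keepdims=True) + 1e-8)
    targ_n = target / (np.linalg.norm(target, axis=-1, keepdims=True) + 1e-8)
    return float(np.mean(1.0 - np.sum(pred_n * targ_n, axis=-1)))


def ema_update(target_params, online_params, tau=0.01):
    """Move target parameters a small step toward the online parameters."""
    return (1.0 - tau) * target_params + tau * online_params


# Hypothetical sizes and random data, chosen only for illustration.
rng = np.random.default_rng(0)
obs_dim, act_dim, latent_dim, batch = 8, 2, 4, 16

W_enc = rng.normal(size=(obs_dim, latent_dim))               # online encoder (assumed linear)
W_tgt = W_enc.copy()                                         # target encoder (assumed EMA copy)
W_dyn = rng.normal(size=(latent_dim + act_dim, latent_dim))  # latent dynamics (assumed linear)

obs = rng.normal(size=(batch, obs_dim))
act = rng.normal(size=(batch, act_dim))
next_obs = rng.normal(size=(batch, obs_dim))

z = obs @ W_enc                                      # encode current observation
z_pred = np.concatenate([z, act], axis=-1) @ W_dyn   # predict next latent from (z, a)
z_target = next_obs @ W_tgt                          # target latent, treated as a constant

loss = cosine_consistency_loss(z_pred, z_target)
W_tgt = ema_update(W_tgt, W_enc)                     # refresh the target encoder
```

In a real training loop, the loss would be minimized with respect to the online encoder and dynamics parameters by a gradient-based optimizer, typically over multi-step rollouts; that part is omitted here.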
Original language: English
Title of host publication: Proceedings of the 40th International Conference on Machine Learning
Editors: Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, Jonathan Scarlett
Publisher: JMLR
Pages: 42227-42246
Number of pages: 20
Status: Published - Jul 2023
MoE publication type: A4 Article in conference proceedings
Event: International Conference on Machine Learning - Honolulu, United States
Duration: 23 Jul 2023 to 29 Jul 2023
Conference number: 40

Publication series

Name: Proceedings of Machine Learning Research
Publisher: PMLR
Volume: 202
ISSN (electronic): 2640-3498

Conference

Conference: International Conference on Machine Learning
Abbreviated title: ICML
Country/Territory: United States
City: Honolulu
Period: 23/07/2023 to 29/07/2023
