Abstract
Reinforcement learning (RL) is able to solve complex sequential decision-making tasks but is currently limited by sample efficiency and required computation. To improve sample efficiency, recent work focuses on model-based RL, which interleaves model learning with planning. Recent methods further utilize policy learning, value estimation, and self-supervised learning as auxiliary objectives. In this paper we show that, surprisingly, a simple representation learning approach relying only on a latent dynamics model trained by latent temporal consistency is sufficient for high-performance RL. This applies when using pure planning with a dynamics model conditioned on the representation, but also when utilizing the representation as policy and value function features in model-free RL. In experiments, our approach learns an accurate dynamics model to solve challenging high-dimensional locomotion tasks with online planners while being 4.1× faster to train compared to ensemble-based methods. With model-free RL without planning, especially on high-dimensional tasks such as the DeepMind Control Suite Humanoid and Dog tasks, our approach outperforms model-free methods by a large margin and matches model-based methods' sample efficiency while training 2.4× faster.
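The core idea in the abstract, training a latent dynamics model with a latent temporal-consistency objective, can be illustrated with a toy sketch. This is a minimal NumPy version with hypothetical names and linear stand-ins for the networks; the actual method uses neural encoders, a target network for the stop-gradient branch, and multi-step prediction, none of which are shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical, not from the paper).
obs_dim, act_dim, latent_dim = 8, 2, 4

# Linear stand-ins for the encoder z = f(o) and latent dynamics z' = g(z, a).
W_enc = rng.normal(size=(latent_dim, obs_dim)) * 0.1
W_dyn = rng.normal(size=(latent_dim, latent_dim + act_dim)) * 0.1

def encode(obs):
    """Map an observation to a latent state."""
    return W_enc @ obs

def predict_next_latent(z, action):
    """Latent dynamics: predict the next latent from (latent, action)."""
    return W_dyn @ np.concatenate([z, action])

def temporal_consistency_loss(obs, action, next_obs):
    """MSE between the predicted next latent and the encoding of the
    actually observed next state. In training, the target branch
    (encode(next_obs)) would be a stop-gradient / target-network copy."""
    z_pred = predict_next_latent(encode(obs), action)
    z_target = encode(next_obs)
    return float(np.mean((z_pred - z_target) ** 2))

# One transition from a replay buffer (random placeholders here).
obs = rng.normal(size=obs_dim)
action = rng.normal(size=act_dim)
next_obs = rng.normal(size=obs_dim)
loss = temporal_consistency_loss(obs, action, next_obs)
```

Minimizing this loss over transitions trains the encoder and dynamics jointly, which is the only representation-learning signal the paper argues is needed.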
Original language | English |
---|---|
Title of host publication | Proceedings of the 40th International Conference on Machine Learning |
Editors | Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, Jonathan Scarlett |
Publisher | JMLR |
Pages | 42227-42246 |
Number of pages | 20 |
Publication status | Published - Jul 2023 |
MoE publication type | A4 Conference publication |
Event | International Conference on Machine Learning, Honolulu, United States. Duration: 23 Jul 2023 → 29 Jul 2023. Conference number: 40 |
Publication series
Name | Proceedings of Machine Learning Research |
---|---|
Publisher | PMLR |
Volume | 202 |
ISSN (Electronic) | 2640-3498 |
Conference
Conference | International Conference on Machine Learning |
---|---|
Abbreviated title | ICML |
Country/Territory | United States |
City | Honolulu |
Period | 23/07/2023 → 29/07/2023 |
Safe: Safe reinforcement learning in non-stationary environments with fast adaptation and disturbance prediction
01/01/2022 → 31/12/2024
Project: Academy of Finland: Other research funding
Finnish Center for Artificial Intelligence
01/01/2019 → 31/12/2022
Project: Academy of Finland: Other research funding