Abstract
Offline reinforcement learning (RL) allows learning sequential behavior from fixed datasets. Since offline datasets do not cover all possible situations, many methods collect additional data during online fine-tuning to improve performance. In general, these methods assume that the transition dynamics remain the same during both the offline and online phases of training. However, in many real-world applications, such as outdoor construction and navigation over rough terrain, it is common for the transition dynamics to vary between the offline and online phases. Moreover, the dynamics may vary during online fine-tuning itself. To address this problem of changing dynamics from offline to online RL, we propose a residual learning approach that infers dynamics changes in order to correct the outputs of the offline solution. During online fine-tuning, we train a context encoder to learn a representation that is consistent within the current online learning environment while still being able to predict dynamic transitions.
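The residual idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learned networks are replaced by small hand-written functions, and every name (`encode_context`, `residual_action`, the toy policies) is hypothetical. It only shows the general structure, in which a context vector inferred from recent online transitions drives a correction that is added to the frozen offline policy's output.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
# One transition: (state, action, next_state). Hypothetical layout for this sketch.
Transition = Tuple[Vector, Vector, Vector]


def encode_context(transitions: Sequence[Transition]) -> Vector:
    """Toy context encoder: mean state change over recent transitions.

    Assumption: this stands in for a learned encoder; the abstract does not
    specify the architecture.
    """
    if not transitions:
        raise ValueError("need at least one transition")
    dim = len(transitions[0][0])
    sums = [0.0] * dim
    for state, _action, next_state in transitions:
        for i in range(dim):
            sums[i] += next_state[i] - state[i]
    return [total / len(transitions) for total in sums]


def residual_action(
    offline_policy: Callable[[Vector], Vector],
    residual_fn: Callable[[Vector, Vector], Vector],
    state: Vector,
    context: Vector,
) -> Vector:
    """Add a context-dependent correction to the offline policy's action."""
    base = offline_policy(state)
    correction = residual_fn(state, context)
    return [b + c for b, c in zip(base, correction)]


# Toy usage: the offline policy pushes the state toward zero, and the
# residual cancels a constant drift inferred from two observed transitions.
recent = [([0.0], [0.0], [0.5]), ([1.0], [0.0], [1.5])]
context = encode_context(recent)
action = residual_action(
    lambda s: [-s[0]],        # hypothetical frozen offline policy
    lambda s, z: [-z[0]],     # hypothetical residual correction
    [2.0],
    context,
)
```

In this toy setting the offline policy is left untouched and only the correction depends on the inferred context, which mirrors the separation the abstract describes between the offline solution and the online adaptation.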
Original language | English |
---|---|
Pages | 1107-1121 |
Number of pages | 15 |
Journal | Proceedings of Machine Learning Research |
Volume | 242 |
Publication status | Published - 2024 |
MoE publication type | A4 Article in conference proceedings |
Event | Learning for Dynamics and Control Conference - Oxford, United Kingdom. Duration: 15 Jul 2024 → 17 Jul 2024 |
Fingerprint
Dive into the research topics of 'Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning'. Together they form a unique fingerprint.
Safe: Safe reinforcement learning in non-stationary environments with fast adaptation and disturbance prediction
Pajarinen, J. (Principal investigator), Kostin, N. (Project member) & Zhao, Y. (Project member)
01/01/2022 → 31/12/2024
Project: Academy of Finland: Other research funding
Finnish Center for Artificial Intelligence
Kaski, S. (Principal investigator)
01/01/2019 → 31/12/2022
Project: Academy of Finland: Other research funding