Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning

Research output: Journal article › Conference article › Scientific › peer-reviewed


Abstract

Offline reinforcement learning (RL) allows learning sequential behavior from fixed datasets. Since offline datasets do not cover all possible situations, many methods collect additional data during online fine-tuning to improve performance. In general, these methods assume that the transition dynamics remain the same during both the offline and online phases of training. However, in many real-world applications, such as outdoor construction and navigation over rough terrain, it is common for the transition dynamics to change between the offline and online phases. Moreover, the dynamics may continue to change during online fine-tuning. To address this problem of changing dynamics from offline to online RL, we propose a residual learning approach that infers dynamics changes to correct the outputs of the offline solution. In the online fine-tuning phase, we train a context encoder to learn a representation that is consistent within the current online learning environment while being able to predict the transition dynamics.
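The abstract describes two components: a context encoder that summarizes recent online transitions into a latent representation of the current dynamics, and a residual module that corrects the actions of the frozen offline policy. The sketch below, in PyTorch, shows one way these pieces might fit together; the class names (ContextEncoder, ResidualCorrector, DynamicsHead), network sizes, and the mean-pooling over the transition history are illustrative assumptions, not the paper's exact architecture or losses.

import torch
import torch.nn as nn


class ContextEncoder(nn.Module):
    """Maps a short history of (s, a, s') transitions to a latent context
    intended to capture the current transition dynamics."""

    def __init__(self, state_dim, action_dim, context_dim, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, context_dim),
        )

    def forward(self, states, actions, next_states):
        # states, actions, next_states: (batch, history_len, dim)
        x = torch.cat([states, actions, next_states], dim=-1)
        return self.net(x).mean(dim=1)  # pool over the history window


class ResidualCorrector(nn.Module):
    """Adds a context-conditioned residual to the offline policy's action."""

    def __init__(self, state_dim, action_dim, context_dim, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + context_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, action_dim),
        )

    def forward(self, state, offline_action, context):
        delta = self.net(torch.cat([state, offline_action, context], dim=-1))
        return offline_action + delta  # corrected action


class DynamicsHead(nn.Module):
    """Predicts the next state from (s, a, context); training this head
    encourages the context to encode the current dynamics."""

    def __init__(self, state_dim, action_dim, context_dim, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + context_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, state_dim),
        )

    def forward(self, state, action, context):
        return self.net(torch.cat([state, action, context], dim=-1))

In such a setup, the offline policy would stay fixed during online fine-tuning: the context encoder could be trained jointly with a next-state prediction loss (e.g. mean-squared error between the DynamicsHead output and the observed next state), while the residual corrector is optimized with the RL objective under the current environment's dynamics.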

Original language: English
Pages: 1107-1121
Number of pages: 15
Journal: Proceedings of Machine Learning Research
Volume: 242
Publication status: Published - 2024
OKM publication type: A4 Article in a conference publication
Event: Learning for Dynamics and Control Conference - Oxford, United Kingdom
Duration: 15 Jul 2024 - 17 Jul 2024

