Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning

Research output: Contribution to journal › Conference article › Scientific › peer-review


Abstract

Offline reinforcement learning (RL) allows learning sequential behavior from fixed datasets. Since offline datasets do not cover all possible situations, many methods collect additional data during online fine-tuning to improve performance. In general, these methods assume that the transition dynamics remain the same during both the offline and online phases of training. However, in many real-world applications, such as outdoor construction and navigation over rough terrain, it is common for the transition dynamics to vary between the offline and online phases, and the dynamics may also keep varying during online fine-tuning. To address this problem of changing dynamics from offline to online RL, we propose a residual learning approach that infers dynamics changes to correct the outputs of the offline solution. During the online fine-tuning phase, we train a context encoder to learn a representation that is consistent within the current online learning environment while still being able to predict transition dynamics.
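Below is a minimal sketch of the idea described in the abstract: a frozen offline policy whose actions are corrected by a residual head conditioned on a learned context vector, where the context encoder is trained through a dynamics-prediction objective. This is not the authors' implementation; the module names (ContextEncoder, ResidualPolicy, DynamicsHead), layer sizes, and losses are assumptions made purely for illustration.

```python
# Illustrative sketch only (assumed architecture, not the paper's code):
# a frozen offline policy is corrected by a residual head conditioned on a
# context vector; the context encoder is trained via next-state prediction.
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Encodes a short history of (s, a, s') transitions into a context z."""
    def __init__(self, obs_dim, act_dim, ctx_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + obs_dim, 128), nn.ReLU(),
            nn.Linear(128, ctx_dim),
        )

    def forward(self, s, a, s_next):
        # Inputs have shape (batch, window, dim); average per-transition
        # embeddings over the history window to get one context per batch item.
        return self.net(torch.cat([s, a, s_next], dim=-1)).mean(dim=1)

class ResidualPolicy(nn.Module):
    """Outputs a bounded action correction given the state and context."""
    def __init__(self, obs_dim, act_dim, ctx_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + ctx_dim, 128), nn.ReLU(),
            nn.Linear(128, act_dim), nn.Tanh(),
        )

    def forward(self, s, z):
        return self.net(torch.cat([s, z], dim=-1))

class DynamicsHead(nn.Module):
    """Predicts the next state from (s, a, z); its loss trains the encoder."""
    def __init__(self, obs_dim, act_dim, ctx_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + ctx_dim, 128), nn.ReLU(),
            nn.Linear(128, obs_dim),
        )

    def forward(self, s, a, z):
        return self.net(torch.cat([s, a, z], dim=-1))

def corrected_action(offline_policy, residual, s, z):
    # The offline policy stays frozen; only the residual adapts online.
    with torch.no_grad():
        a_offline = offline_policy(s)
    return a_offline + residual(s, z)
```

In a setup like this, the next-state prediction loss backpropagates into the encoder so that the context z reflects the current transition dynamics, while the residual policy is fine-tuned online on top of the frozen offline actions.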

Original language: English
Pages (from-to): 1107-1121
Number of pages: 15
Journal: Proceedings of Machine Learning Research
Volume: 242
Publication status: Published - 2024
MoE publication type: A4 Conference publication
Event: Learning for Dynamics and Control Conference - Oxford, United Kingdom
Duration: 15 Jul 2024 - 17 Jul 2024

Keywords

  • Adaptive RL
  • Context Encoding
  • Offline-to-Online RL
