Meta Reinforcement Learning for Sim-to-real Domain Adaptation

Karol Arndt, Murtaza Hazara, Ali Ghadirzadeh, Ville Kyrki

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review

4 Citations (Scopus)

Abstract

Modern reinforcement learning methods suffer from low sample efficiency and unsafe exploration, making it infeasible to train robotic policies entirely on real hardware. In this work, we propose to address the problem of sim-to-real domain transfer by using meta learning to train a policy that can adapt to a variety of dynamic conditions, and using a task-specific trajectory generation model to provide an action space that facilitates quick exploration. We evaluate the method by performing domain adaptation in simulation and analyzing the structure of the latent space during adaptation. We then deploy this policy on a KUKA LBR 4+ robot and evaluate its performance on a task of hitting a hockey puck to a target. Our method shows more consistent and stable domain adaptation than the baseline, resulting in better overall performance.
Original language: English
Title of host publication: Proceedings of the IEEE Conference on Robotics and Automation, ICRA 2020
Publisher: IEEE
Pages: 2725-2731
Number of pages: 7
ISBN (Electronic): 978-1-7281-7395-5
Publication status: Published - 2020
MoE publication type: A4 Article in a conference publication
Event: IEEE International Conference on Robotics and Automation - Online
Duration: 31 May 2020 to 31 Aug 2020

Publication series

Name: IEEE International Conference on Robotics and Automation
Publisher: IEEE
ISSN (Print): 2152-4092
ISSN (Electronic): 2379-9552

Conference

Conference: IEEE International Conference on Robotics and Automation
Abbreviated title: ICRA
Period: 31/05/2020 to 31/08/2020

Keywords

  • Control engineering computing
  • Learning (artificial intelligence)
  • Manipulators
  • Neural nets
  • Transfer learning
