
Meta-Learning for Multi-objective Reinforcement Learning

  • Xi Chen*
  • Ali Ghadirzadeh
  • Mårten Björkman
  • Patric Jensfelt

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

37 Citations (Web of Science)

Abstract

Multi-objective reinforcement learning (MORL) is the generalization of standard reinforcement learning (RL) approaches to sequential decision-making problems that consist of several, possibly conflicting, objectives. Generally, in such formulations, there is no single optimal policy that optimizes all the objectives simultaneously; instead, a number of policies have to be found, each optimizing a different preference over the objectives. In this paper, we introduce a novel MORL approach by training a meta-policy, a policy simultaneously trained with multiple tasks sampled from a task distribution, for a number of randomly sampled Markov decision processes (MDPs). In other words, MORL is framed as a meta-learning problem, with the task distribution given by a distribution over the preferences. We demonstrate that such a formulation results in a better approximation of the Pareto optimal solutions in terms of both optimality and computational efficiency. We evaluated our method on obtaining Pareto optimal policies using a number of continuous control problems with high degrees of freedom.
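
The framing in the abstract, where each task corresponds to one preference over the objectives, can be illustrated with a small sketch. This record does not state which preference distribution or scalarization the paper uses, so everything here is an assumption made for illustration: preferences are drawn uniformly from the probability simplex, a vector reward is collapsed with a linear weighted sum, and the helper names (`sample_preference`, `scalarize`, `dominates`, `pareto_front`) and all numbers are hypothetical.

```python
import random


def sample_preference(num_objectives, rng):
    """Draw a preference vector uniformly from the probability simplex.

    Assumption: normalizing independent Exp(1) draws yields a flat
    Dirichlet sample; the paper's actual task distribution may differ.
    """
    draws = [rng.expovariate(1.0) for _ in range(num_objectives)]
    total = sum(draws)
    return [d / total for d in draws]


def scalarize(reward_vector, preference):
    """Collapse a vector reward into a scalar with a weighted sum (assumed)."""
    return sum(w * r for w, r in zip(preference, reward_vector))


def dominates(a, b):
    """True if return vector a Pareto-dominates b (higher is better)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


rng = random.Random(0)

# One sampled preference defines one task: a scalar-reward MDP.
preference = sample_preference(2, rng)
reward = [1.0, -0.5]  # made-up two-objective reward for a single step
print(scalarize(reward, preference))

# Made-up per-policy returns; only non-dominated ones form the front.
returns = [(1.0, 0.0), (0.6, 0.6), (0.0, 1.0), (0.5, 0.5), (0.2, 0.1)]
print(pareto_front(returns))  # -> [(1.0, 0.0), (0.6, 0.6), (0.0, 1.0)]
```

Under these assumptions, each sampled preference turns the multi-objective problem into an ordinary single-objective RL task, and the set of non-dominated returns approximates the Pareto front that the abstract refers to.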

Original language: English
Title of host publication: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019
Publisher: IEEE
Pages: 977-983
Number of pages: 7
ISBN (Electronic): 978-1-7281-4004-9
Publication status: Published - 2019
MoE publication type: A4 Conference publication
Event: IEEE/RSJ International Conference on Intelligent Robots and Systems - The Venetian Macao, Macau, China
Duration: 4 Nov 2019 to 8 Nov 2019
https://www.iros2019.org/

Publication series

Name: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: IEEE/RSJ International Conference on Intelligent Robots and Systems
Abbreviated title: IROS
Country/Territory: China
City: Macau
Period: 04/11/2019 to 08/11/2019

Funding

This work is supported by the European Union's Horizon 2020 research and innovation program, the CENTAURO project (under grant agreement No. 644839), the socSMCs project (H2020-FETPROACT-2014), and also by the Academy of Finland through the DEEPEN project.

Keywords

  • Decision making
  • Learning
  • Markov processes
  • Pareto optimisation

Projects
  • Deep reinforcement learning for physical agents

    Kyrki, V. (Principal investigator), Yang, Y. (Project Member), Hazara, M. (Project Member), Arndt, K. (Project Member), Ghadirzadeh, A. (Project Member), Hämäläinen, A. (Project Member) & Struckmeier, O. (Project Member)

    01/01/2018 to 31/12/2019

    Project: Academy of Finland: Other research funding
