Meta-Learning for Multi-objective Reinforcement Learning

Xi Chen*, Ali Ghadirzadeh, Mårten Björkman, Patric Jensfelt

*Corresponding author for this work

Research output: Article in book/conference proceedings › Conference article in proceedings › Scientific › peer-reviewed

Abstract

Multi-objective reinforcement learning (MORL) is the generalization of standard reinforcement learning (RL) approaches to solve sequential decision-making problems that consist of several, possibly conflicting, objectives. Generally, in such formulations, there is no single optimal policy that optimizes all the objectives simultaneously; instead, a number of policies have to be found, each optimizing a particular preference over the objectives. In this paper, we introduce a novel MORL approach by training a meta-policy, a policy simultaneously trained with multiple tasks sampled from a task distribution, for a number of randomly sampled Markov decision processes (MDPs). In other words, MORL is framed as a meta-learning problem, with the task distribution given by a distribution over the preferences. We demonstrate that such a formulation results in a better approximation of the Pareto optimal solutions in terms of both optimality and computational efficiency. We evaluate our method on obtaining Pareto optimal policies using a number of continuous control problems with high degrees of freedom.
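
As a rough illustration of the framing in the abstract, where each task corresponds to a preference over the objectives, the sketch below samples a small batch of preference vectors and uses each one to collapse a vector-valued reward into a scalar. The abstract does not say how preferences are sampled or how they enter the reward, so the symmetric Dirichlet distribution and the weighted-sum scalarization here are assumptions, and the names `sample_preference` and `scalarize` are hypothetical.

```python
import random


def sample_preference(num_objectives, rng, alpha=1.0):
    """Draw one preference vector from a symmetric Dirichlet(alpha).

    Assumption: the record does not state the preference distribution;
    a Dirichlet is used here only because it yields non-negative
    weights that sum to one.
    """
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_objectives)]
    total = sum(draws)
    return [d / total for d in draws]


def scalarize(reward_vector, preference):
    """Weighted-sum scalarization of a vector-valued reward (an assumption)."""
    return sum(w * r for w, r in zip(preference, reward_vector))


# Each sampled preference defines one task (one scalar-reward MDP).
rng = random.Random(0)
task_batch = [sample_preference(2, rng) for _ in range(4)]

# Hypothetical two-objective reward observed at a single step.
example_reward = [1.0, -0.5]
scalar_rewards = [scalarize(example_reward, w) for w in task_batch]
```

In a meta-learning loop, a batch like `task_batch` would play the role of the tasks drawn from the task distribution, with one scalar reward signal per sampled preference.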

Original language: English
Title of host publication: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019
Publisher: IEEE
Pages: 977-983
Number of pages: 7
ISBN (electronic): 978-1-7281-4004-9
Status: Published - 2019
OKM publication type: A4 Article in conference proceedings
Event: IEEE/RSJ International Conference on Intelligent Robots and Systems - The Venetian Macao, Macau, China
Duration: 4 Nov 2019 to 8 Nov 2019
https://www.iros2019.org/

Publication series

Name: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems
ISSN (print): 2153-0858
ISSN (electronic): 2153-0866

Conference

Conference: IEEE/RSJ International Conference on Intelligent Robots and Systems
Abbreviated title: IROS
Country/Territory: China
City: Macau
Period: 04/11/2019 to 08/11/2019
