Hybrid Surrogate Assisted Evolutionary Multiobjective Reinforcement Learning for Continuous Robot Control

Atanu Mazumdar*, Ville Kyrki

*Corresponding author for this work

Research output: Article in book/conference proceedings > Conference article in proceedings > Scientific > peer-reviewed

Abstract

Many real world reinforcement learning (RL) problems consist of multiple conflicting objective functions that need to be optimized simultaneously. Finding these optimal policies (known as Pareto optimal policies) for different preferences of objectives requires extensive state space exploration. Thus, obtaining a dense set of Pareto optimal policies is challenging and often reduces the sample efficiency. In this paper, we propose a hybrid multiobjective policy optimization approach for solving multiobjective reinforcement learning (MORL) problems with continuous actions. Our approach combines the faster convergence of multiobjective policy gradient (MOPG) and a surrogate assisted multiobjective evolutionary algorithm (MOEA) to produce a dense set of Pareto optimal policies. The solutions found by the MOPG algorithm are utilized to build computationally inexpensive surrogate models in the parameter space of the policies that approximate the return of policies. An MOEA is executed that utilizes the surrogates’ mean prediction and uncertainty in the prediction to find approximate optimal policies. The final solution policies are later evaluated using the simulator and stored in an archive. Tests on multiobjective continuous action RL benchmarks show that a hybrid surrogate assisted multiobjective evolutionary optimizer with robust selection criterion produces a dense set of Pareto optimal policies without extensively exploring the state space. We also apply the proposed approach to train Pareto optimal agents for autonomous driving, where the hybrid approach produced superior results compared to a state-of-the-art MOPG algorithm.
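The abstract says the MOEA uses both the surrogates' mean prediction and their uncertainty, together with a robust selection criterion, but it does not state the criterion itself. The following is a minimal sketch of one common way such a step could look, not the authors' method: each predicted return is penalized by its predictive standard deviation (a lower-confidence-bound style score, assumed here), and the non-dominated candidates are then kept. The function names, the `kappa` weight, and the numbers are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def pessimistic_returns(mean: Sequence[float], std: Sequence[float],
                        kappa: float = 1.0) -> tuple:
    """Penalize each predicted return by kappa times its predictive std.

    This lower-confidence-bound score is an assumption for illustration;
    the abstract does not specify the selection criterion.
    """
    return tuple(m - kappa * s for m, s in zip(mean, std))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def non_dominated(points: Sequence[Sequence[float]]) -> List[int]:
    """Return indices of points that no other point dominates."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical surrogate outputs for four candidate policies, two objectives.
means = [(10.0, 2.0), (8.0, 5.0), (9.5, 4.5), (3.0, 9.0)]
stds = [(0.5, 0.5), (0.5, 0.5), (4.0, 4.0), (0.5, 0.5)]

scores = [pessimistic_returns(m, s, kappa=1.0) for m, s in zip(means, stds)]
front_mean_only = non_dominated(means)   # ranks on mean prediction alone
front_robust = non_dominated(scores)     # ranks on uncertainty-penalized score

print(front_mean_only)
print(front_robust)
```

In this toy data the third candidate looks non-dominated on mean prediction alone, but its large predictive uncertainty removes it from the front once the penalty is applied. In the approach the abstract describes, the selected policies would afterwards be evaluated in the simulator and stored in an archive.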

Original language: English
Title of host publication: Applications of Evolutionary Computation - 27th European Conference, EvoApplications 2024, Held as Part of EvoStar 2024, Proceedings
Editors: Stephen Smith, João Correia, Christian Cintrano
Publisher: Springer
Pages: 61-75
Number of pages: 15
ISBN (print): 9783031568541
DOI - permanent links
Status: Published - 2024
Ministry of Education and Culture (OKM) publication type: A4 Article in conference proceedings
Event: European Conference on Applications of Evolutionary Computation - Aberystwyth, United Kingdom
Duration: 3 Apr 2024 to 5 Apr 2024
Conference number: 27

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14635 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: European Conference on Applications of Evolutionary Computation
Abbreviated title: EvoApplications
Country/Territory: United Kingdom
City: Aberystwyth
Period: 03/04/2024 to 05/04/2024
