Hybrid Surrogate Assisted Evolutionary Multiobjective Reinforcement Learning for Continuous Robot Control

Atanu Mazumdar*, Ville Kyrki

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

2 Citations (Scopus)

Abstract

Many real-world reinforcement learning (RL) problems involve multiple conflicting objective functions that must be optimized simultaneously. Finding the optimal policies for different preferences over the objectives (known as Pareto optimal policies) requires extensive state space exploration. Obtaining a dense set of Pareto optimal policies is therefore challenging and often reduces sample efficiency. In this paper, we propose a hybrid multiobjective policy optimization approach for solving multiobjective reinforcement learning (MORL) problems with continuous actions. Our approach combines the faster convergence of multiobjective policy gradient (MOPG) with a surrogate assisted multiobjective evolutionary algorithm (MOEA) to produce a dense set of Pareto optimal policies. The solutions found by the MOPG algorithm are used to build computationally inexpensive surrogate models, defined in the parameter space of the policies, that approximate the return of a policy. An MOEA then uses the surrogates’ mean predictions and their prediction uncertainty to find approximately optimal policies. The final solution policies are then evaluated in the simulator and stored in an archive. Tests on multiobjective continuous action RL benchmarks show that a hybrid surrogate assisted multiobjective evolutionary optimizer with a robust selection criterion produces a dense set of Pareto optimal policies without extensively exploring the state space. We also apply the proposed approach to train Pareto optimal agents for autonomous driving, where the hybrid approach produced superior results compared to a state-of-the-art MOPG algorithm.
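The abstract does not spell out the robust selection criterion, so the following is only a minimal sketch of one plausible reading: each candidate policy is scored by a pessimistic bound (surrogate mean minus a multiple of the predicted standard deviation), and only candidates that are non-dominated under that bound are kept. The function names, the `kappa` weight, and the numbers are all hypothetical illustrations, not the authors' method or data.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pessimistic_scores(
    means: Sequence[Sequence[float]],
    stds: Sequence[Sequence[float]],
    kappa: float = 1.0,
) -> List[Tuple[float, ...]]:
    """Assumed robust score: mean prediction minus kappa times the
    predicted standard deviation, per objective."""
    return [
        tuple(m - kappa * s for m, s in zip(mu, sd))
        for mu, sd in zip(means, stds)
    ]


def nondominated_indices(points: Sequence[Sequence[float]]) -> List[int]:
    """Indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical surrogate outputs for four candidate policies, two objectives.
means = [(10.0, 2.0), (8.0, 5.0), (9.0, 4.5), (3.0, 9.0)]
stds = [(0.5, 0.2), (0.3, 0.3), (4.0, 3.0), (0.4, 0.5)]

by_mean = nondominated_indices(means)
by_bound = nondominated_indices(pessimistic_scores(means, stds, kappa=1.0))
print(by_mean)   # [0, 1, 2, 3]
print(by_bound)  # [0, 1, 3]
```

In this toy data the third candidate looks Pareto optimal on mean predictions alone, but its large predicted uncertainty pushes its pessimistic score below the first candidate's, so it is filtered out before any simulator evaluation would be spent on it.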

Original language: English
Title of host publication: Applications of Evolutionary Computation - 27th European Conference, EvoApplications 2024, Held as Part of EvoStar 2024, Proceedings
Editors: Stephen Smith, João Correia, Christian Cintrano
Publisher: Springer
Pages: 61-75
Number of pages: 15
ISBN (Print): 9783031568541
Publication status: Published - 2024
MoE publication type: A4 Conference publication
Event: European Conference on Applications of Evolutionary Computation - Aberystwyth, United Kingdom
Duration: 3 Apr 2024 → 5 Apr 2024
Conference number: 27

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14635 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Applications of Evolutionary Computation
Abbreviated title: EvoApplications
Country/Territory: United Kingdom
City: Aberystwyth
Period: 03/04/2024 → 05/04/2024

Keywords

  • multiobjective evolutionary optimization
  • multiobjective policy gradient
  • multiobjective reinforcement learning
  • Pareto front
  • surrogate assisted optimization
