A Scale-Independent Multi-Objective Reinforcement Learning with Convergence Analysis

Research output: Chapter in book/conference proceedings › Conference article in proceedings › Scientific › Peer-reviewed

Abstract

Many sequential decision-making problems require optimizing several objectives that may conflict with one another. The conventional way to handle such a multitask problem is to build a scalar objective function from a linear combination of the individual objectives. However, when the conflicting objectives have different scales, this method requires trial and error to find suitable weights for the combination. As such, in most cases, this approach cannot guarantee a Pareto-optimal solution. In this paper, we develop a single-agent, scale-independent multi-objective reinforcement learning algorithm based on the Advantage Actor-Critic (A2C) algorithm. We then carry out a convergence analysis of the devised multi-objective algorithm, which provides a convergence-in-mean guarantee. Finally, we run experiments on a multitask problem to evaluate the performance of the proposed algorithm. Simulation results show that the developed multi-objective A2C approach outperforms the single-objective algorithm.
Original language: English
Title of host publication: 2023 62nd IEEE Conference on Decision and Control (CDC)
Publisher: IEEE
Pages: 1326-1333
Number of pages: 8
ISBN (print): 979-8-3503-0125-0
Status: Published - 15 Dec 2023
OKM publication type: A4 Article in conference proceedings
Event: IEEE Conference on Decision and Control - Marina Bay Sands, Singapore, Singapore
Duration: 13 Dec 2023 to 15 Dec 2023
Conference number: 62
https://cdc2023.ieeecss.org/

Publication series

Name: Proceedings of the IEEE Conference on Decision & Control
ISSN (electronic): 2576-2370

Conference

Conference: IEEE Conference on Decision and Control
Abbreviated title: CDC
Country/Territory: Singapore
City: Singapore
Period: 13/12/2023 to 15/12/2023
