Abstract
Hierarchical reinforcement learning (HRL) proposes to solve difficult tasks by performing decision-making and control at successively higher levels of temporal abstraction. However, off-policy HRL often suffers from the problem of a non-stationary high-level policy since the low-level policy is constantly changing. In this paper, we propose a novel HRL approach for mitigating the non-stationarity by adversarially enforcing the high-level policy to generate subgoals compatible with the current instantiation of the low-level policy. In practice, the adversarial learning is implemented by training a simple state-conditioned discriminator network, which determines the compatibility level of subgoals, concurrently with the high-level policy. Comparison to state-of-the-art algorithms shows that our approach improves both learning efficiency and performance in challenging continuous control tasks.
Original language | English |
---|---|
Title of host publication | AAAI-23 Technical Tracks 8 |
Editors | Brian Williams, Yiling Chen, Jennifer Neville |
Publisher | AAAI Press |
Pages | 10184-10191 |
Number of pages | 8 |
ISBN (electronic) | 978-1-57735-880-0 |
DOIs | |
Publication status | Published - 26 Jun 2023 |
MoE publication type | A4 Article in conference proceedings |
Event | AAAI Conference on Artificial Intelligence - Walter E. Washington Convention Center, Washington, United States. Duration: 7 Feb 2023 → 14 Feb 2023. Conference number: 37. https://aaai-23.aaai.org/ |
Publication series
Name | Proceedings of the AAAI Conference on Artificial Intelligence |
---|---|
Volume | 37 |
ISSN (electronic) | 2374-3468 |
Conference
Conference | AAAI Conference on Artificial Intelligence |
---|---|
Abbreviated title | AAAI |
Country/Territory | United States |
City | Washington |
Period | 07/02/2023 → 14/02/2023 |
Web address |