Hierarchical Imitation Learning with Vector Quantized Models

Kalle Kujanpää, Joni Pajarinen, Alexander Ilin

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

1 Citation (Scopus)
75 Downloads (Pure)

Abstract

The ability to plan actions on multiple levels of abstraction enables intelligent agents to solve complex tasks effectively. However, learning the models for both low and high-level planning from demonstrations has proven challenging, especially with higher-dimensional inputs. To address this issue, we propose to use reinforcement learning to identify subgoals in expert trajectories by associating the magnitude of the rewards with the predictability of low-level actions given the state and the chosen subgoal. We build a vector-quantized generative model for the identified subgoals to perform subgoal-level planning. In experiments, the algorithm excels at solving complex, long-horizon decision-making problems, outperforming the state of the art. Because of its ability to plan, our algorithm can find better trajectories than the ones in the training set.
Original language: English
Title of host publication: Proceedings of the 40th International Conference on Machine Learning
Editors: Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, Jonathan Scarlett
Publisher: JMLR
Pages: 17896-17919
Number of pages: 24
Publication status: Published - Jul 2023
MoE publication type: A4 Conference publication
Event: International Conference on Machine Learning - Honolulu, United States
Duration: 23 Jul 2023 → 29 Jul 2023
Conference number: 40

Publication series

Name: Proceedings of Machine Learning Research
Publisher: PMLR
Volume: 202
ISSN (Electronic): 2640-3498

Conference

Conference: International Conference on Machine Learning
Abbreviated title: ICML
Country/Territory: United States
City: Honolulu
Period: 23/07/2023 → 29/07/2023
