Abstract
Inverse reinforcement learning (IRL) methods learn a reward function from expert demonstrations, such as human behavior, offering a practical way to craft reward functions for complex environments. However, IRL is computationally expensive when applied to large populations of demonstrators, as existing IRL algorithms require solving a separate reinforcement learning (RL) problem for each individual. We propose a new IRL approach based on contextual RL, in which an optimal policy is learned for multiple contexts. We first learn a contextual policy that directly provides the RL solution for a parametric family of reward functions, and then reuse it for IRL on each individual in the population. We motivate our method with the scenario of AI-driven playtesting of video games and focus on an interpretable family of reward functions. We evaluate the method on a navigation task and the battle arena game Derk, where it successfully recovers distinct player reward preferences from a simulated population and provides substantial time savings compared to a strong adversarial IRL baseline.
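To make the two-phase recipe in the abstract concrete, here is a minimal sketch under toy assumptions: a 1-D chain environment, a two-parameter goal-preference reward family, and a grid of pre-solved reward parameters with nearest-neighbour lookup standing in for the trained contextual policy. All names (`soft_q`, `contextual_policy`, `irl_fit`) and the environment are illustrative, not the paper's implementation.

```python
# Hypothetical sketch (not the paper's code) of the two-phase idea:
# amortize RL across a parametric reward family with a "contextual"
# policy, then fit each individual's reward parameters by maximizing
# demonstration likelihood under that shared policy.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, ACTIONS, BETA = 7, (-1, +1), 5.0   # toy 1-D chain, softmax temp

def reward(state, theta):
    # Interpretable two-parameter family: preference for each end goal.
    return theta[0] * (state == 0) + theta[1] * (state == N_STATES - 1)

def step(s, a):
    return min(max(s + a, 0), N_STATES - 1)

def soft_q(theta, gamma=0.95, iters=200):
    """Soft value iteration: the expensive 'RL solve' for one theta."""
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(iters):
        v = np.log(np.exp(BETA * q).sum(axis=1)) / BETA
        for s in range(N_STATES):
            for i, a in enumerate(ACTIONS):
                s2 = step(s, a)
                q[s, i] = reward(s2, theta) + gamma * v[s2]
    return q

# Phase 1: a contextual policy over the reward family. The paper trains
# one policy conditioned on theta; a grid of pre-solved thetas with
# nearest-neighbour lookup stands in for that here.
theta_grid = [np.array([a, b]) for a in np.linspace(0, 1, 5)
                               for b in np.linspace(0, 1, 5)]
q_table = {tuple(t): soft_q(t) for t in theta_grid}

def contextual_policy(theta):
    nearest = min(theta_grid, key=lambda t: np.linalg.norm(t - theta))
    logits = BETA * q_table[tuple(nearest)]
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)       # pi(a | s, theta)

# Phase 2: per-individual IRL needs no fresh RL solve, only a search
# over theta under the already-trained contextual policy.
def irl_fit(demos):
    def loglik(th):
        pi_th = contextual_policy(th)
        return sum(np.log(pi_th[s, a]) for s, a in demos)
    return max(theta_grid, key=loglik)

# Simulate one member of the population who prefers the right-hand goal.
true_theta = np.array([0.1, 0.9])
pi = contextual_policy(true_theta)
s, demos = N_STATES // 2, []
for _ in range(30):
    a = rng.choice(len(ACTIONS), p=pi[s])
    demos.append((s, a))
    s = step(s, ACTIONS[a])

print("recovered theta:", irl_fit(demos))   # should weight theta[1] higher
```

The point of the amortization is that `soft_q`, the expensive RL solve, runs only in phase 1; fitting each new demonstrator in phase 2 reduces to a cheap likelihood search over the reward parameters, which is what makes the approach scale to populations.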
| Original language | English |
| --- | --- |
| Number of pages | 23 |
| Journal | Transactions on Machine Learning Research |
| Publication status | Published - 10 Jul 2024 |
| MoE publication type | A1 Journal article-refereed |
Projects
- Finnish Center for Artificial Intelligence
Kaski, S. (Principal investigator)
01/01/2019 → 31/12/2022
Project: Academy of Finland: Other research funding