Abstract
In reinforcement learning, the weights of the reward function must be tuned to align behavior with user preferences. However, current approaches, which use pairwise comparisons for preference elicitation, are inefficient because they leave much of the human ability to explore and judge groups of candidate solutions unused. This paper presents a novel visualization-based approach that better exploits the user’s ability to quickly recognize interesting directions for reward tuning. It breaks down the tuning problem using the visual information-seeking principle: overview first, zoom and filter, then details-on-demand. Following this principle, we built a visualization system comprising two interactively linked views: 1) an embedding view showing a contextual overview of all sampled behaviors and 2) a sample view displaying selected behaviors and visualizations of the detailed time-series data. A user can efficiently explore large sets of samples by iterating between these two views. The paper demonstrates that the proposed approach is capable of tuning rewards for challenging behaviors. The simulation-based evaluation shows that the system can reach optimal solutions with fewer queries than the baselines.
Original language | English |
---|---|
Title of host publication | 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Publisher | IEEE |
Number of pages | 8 |
ISBN (Electronic) | 979-8-3503-7770-5 |
Publication status | Published - 2024 |
MoE publication type | A4 Conference publication |
Event | IEEE/RSJ International Conference on Intelligent Robots and Systems, ADNEC, Abu Dhabi, United Arab Emirates. Duration: 14 Oct 2024 → 18 Oct 2024. https://iros2024-abudhabi.org/ |
Publication series
Name | Proceedings of the International Conference on Intelligent Robots and Systems |
---|---|
ISSN (Electronic) | 2153-0866 |
Conference
Conference | IEEE/RSJ International Conference on Intelligent Robots and Systems |
---|---|
Abbreviated title | IROS |
Country/Territory | United Arab Emirates |
City | Abu Dhabi |
Period | 14/10/2024 → 18/10/2024 |