Interactive Reward Tuning: Interactive Visualization for Preference Elicitation

Research output: Article in book/conference proceedings, conference article in proceedings, scientific, peer-reviewed

Abstract

In reinforcement learning, tuning reward weights in the reward function is necessary to align behavior with user preferences. However, current approaches, which use pairwise comparisons for preference elicitation, are inefficient, because they miss much of the human ability to explore and judge groups of candidate solutions. The paper presents a novel visualization-based approach that better exploits the user’s ability to quickly recognize interesting directions for reward tuning. It breaks down the tuning problem by using the visual information-seeking principle: overview first, zoom and filter, then details-on-demand. Following this principle, we built a visualization system comprising two interactively linked views: 1) an embedding view showing a contextual overview of all sampled behaviors and 2) a sample view displaying selected behaviors and visualizations of the detailed time-series data. A user can efficiently explore large sets of samples by iterating between these two views. The paper demonstrates that the proposed approach is capable of tuning rewards for challenging behaviors. The simulation-based evaluation shows that the system can reach optimal solutions with fewer queries relative to baselines.
Original language: English
Title of host publication: 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Publisher: IEEE
Number of pages: 8
ISBN (electronic): 979-8-3503-7770-5
Publication status: Published - 2024
OKM (Finnish Ministry of Education and Culture) publication type: A4 Article in conference proceedings
Event: IEEE/RSJ International Conference on Intelligent Robots and Systems - ADNEC, Abu Dhabi, United Arab Emirates
Duration: 14 Oct 2024 to 18 Oct 2024
https://iros2024-abudhabi.org/

Publication series

Name: Proceedings of the International Conference on Intelligent Robots and Systems
ISSN (electronic): 2153-0866

Conference

Conference: IEEE/RSJ International Conference on Intelligent Robots and Systems
Abbreviated title: IROS
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 14/10/2024 to 18/10/2024