Abstract

We consider the problem of creating assistants that can help agents solve new sequential decision problems, assuming the agent is not able to specify the reward function explicitly to the assistant. Instead of acting in place of the agent as in current automation-based approaches, we give the assistant an advisory role and keep the agent in the loop as the main decision maker. The difficulty is that we must account for potential biases of the agent, which may cause it to seemingly irrationally reject advice. To do this, we introduce a novel formalization of assistance that models these biases, allowing the assistant to infer and adapt to them. We then introduce a new method for planning the assistant's actions that can scale to large decision-making problems. We show experimentally that our approach adapts to these agent biases and results in higher cumulative reward for the agent than automation-based alternatives. Lastly, we show that an approach combining advice and automation outperforms advice alone, at the cost of losing some safety guarantees.
Original language: English
Title of host publication: AAAI-23 Technical Tracks 10
Editors: Brian Williams, Yiling Chen, Jennifer Neville
Publisher: AAAI Press
Pages: 11551-11559
Number of pages: 9
ISBN (electronic): 978-1-57735-880-0
Status: Published - 27 Jun 2023
MoE publication type: A4 Article in conference proceedings
Event: AAAI Conference on Artificial Intelligence - Walter E. Washington Convention Center, Washington, United States
Duration: 7 Feb 2023 to 14 Feb 2023
Conference number: 37
https://aaai-23.aaai.org/

Publication series

Name: Proceedings of the AAAI Conference on Artificial Intelligence
Number: 10
Volume: 37
ISSN (electronic): 2374-3468

Conference

Conference: AAAI Conference on Artificial Intelligence
Abbreviated title: AAAI
Country/Territory: United States
City: Washington
Period: 07/02/2023 to 14/02/2023
