Abstract
Cooperation among AI systems, and between AI systems and humans, is becoming increasingly important. In various real-world tasks, an agent needs to cooperate with unknown partner agent types. This requires the agent to assess the behaviour of the partner agent during a cooperative task and to adjust its own policy to support the cooperation. Deep reinforcement learning models can be trained to deliver the required functionality but are known to suffer from sample inefficiency and slow learning. However, adapting to a partner agent's behaviour during an ongoing task requires the ability to assess the partner agent type quickly. We suggest a method in which we synthetically produce populations of agents with different behavioural patterns, together with ground truth data of their behaviour, and use this data for training a meta-learner. We additionally suggest an agent architecture that can efficiently use the generated data and gain the meta-learning capability. When an agent is equipped with such a meta-learner, it is capable of quickly adapting to cooperation with unknown partner agent types in new situations. This method can be used to automatically form a task distribution for meta-training from emerging behaviours that arise, for example, through self-play.
Original language | English |
---|---|
Title | Artificial Neural Networks and Machine Learning – ICANN 2021 - 30th International Conference on Artificial Neural Networks, Proceedings |
Editors | Igor Farkaš, Paolo Masulli, Sebastian Otte, Stefan Wermter |
Publisher | Springer |
Pages | 493-504 |
Number of pages | 12 |
ISBN (print) | 9783030863791 |
DOI - permanent links | |
Status | Published - 2021 |
Ministry of Education and Culture (OKM) publication type | A4 Article in conference proceedings |
Event | International Conference on Artificial Neural Networks - Virtual, Online. Duration: 14 Sept 2021 → 17 Sept 2021. Conference number: 30 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 12894 LNCS |
ISSN (print) | 0302-9743 |
ISSN (electronic) | 1611-3349 |
Conference
Conference | International Conference on Artificial Neural Networks |
---|---|
Abbreviated title | ICANN |
City | Virtual, Online |
Period | 14/09/2021 → 17/09/2021 |
Fingerprint
Dive into the research topics of 'Behaviour-Conditioned Policies for Cooperative Reinforcement Learning Tasks'. Together they form a unique fingerprint.
Projects
- 2 Finished
- Interactive machine learning from multiple biodata sources
Kaski, S. (Principal investigator), Hämäläinen, A. (Project member), Gadd, C. (Project member), Hegde, P. (Project member), Shen, Z. (Project member), Siren, J. (Project member), Trinh, T. (Project member), Jain, A. (Project member) & Jälkö, J. (Project member)
01/01/2019 → 31/08/2021
Project: Academy of Finland: Other research funding
- Interactive machine learning from multiple biodata sources
Kaski, S. (Principal investigator) & Filstroff, L. (Project member)
01/01/2016 → 31/08/2021
Project: Academy of Finland: Other research funding