VendiRL: A Framework for Self-Supervised Reinforcement Learning of Diversely Diverse Skills

Research output: Contribution to conference, Paper, Scientific, peer-review

Abstract

In self-supervised reinforcement learning (RL), one of the key challenges is learning a diverse set of skills to prepare agents for unknown future tasks. Despite impressive advances, scalability and evaluation remain prevalent issues. Regarding scalability, the search for meaningful skills can be obscured by high-dimensional feature spaces, where relevant features may vary across downstream task domains. For evaluating skill diversity, defining what constitutes "diversity" typically requires a hard commitment to a specific notion of what it means for skills to be diverse, potentially leading to inconsistencies in how skill diversity is understood, making results across different approaches hard to compare, and leaving many forms of diversity unexplored. To address these issues, we adopt a measure of sample diversity that translates ideas from ecology to machine learning—the Vendi Score—allowing the user to specify and evaluate any desired form of diversity. We demonstrate how this metric facilitates skill evaluation and introduce VendiRL, a unified framework for learning diversely diverse sets of skills. Given distinct similarity functions, VendiRL motivates distinct forms of diversity, which could support skill-diversity pretraining in new and richly interactive environments where optimising for various forms of diversity may be desirable.
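As a rough illustration of the metric named in the abstract, the sketch below computes the Vendi Score as it is commonly defined: the exponential of the Shannon entropy of the eigenvalues of K/n, where K is an n-by-n similarity matrix with ones on the diagonal. It is not taken from the paper's code. The function names, the toy "skill outcome" tuples, and the two indicator similarity functions are hypothetical, chosen only to show how distinct similarity functions can give distinct diversity readings for the same set of skills.

```python
import numpy as np


def vendi_score(samples, similarity):
    """Exponential of the Shannon entropy of the eigenvalues of K / n.

    Assumes `similarity` is symmetric, positive semi-definite as a kernel,
    and returns 1.0 for identical inputs.
    """
    n = len(samples)
    # Pairwise similarity matrix K, with K[i][j] = similarity(x_i, x_j).
    K = np.array([[similarity(a, b) for b in samples] for a in samples], dtype=float)
    # K is symmetric, so eigvalsh applies; the eigenvalues of K / n sum to 1.
    eigvals = np.linalg.eigvalsh(K / n)
    # Drop (numerically) zero eigenvalues, using the convention 0 * log 0 = 0.
    eigvals = eigvals[eigvals > 1e-12]
    return float(np.exp(-np.sum(eigvals * np.log(eigvals))))


# Hypothetical skill outcomes: final (x, y) positions reached by three skills.
outcomes = [(0, 0), (0, 1), (0, 2)]


def same_x(a, b):
    """Hypothetical similarity: skills count as identical if they share x."""
    return 1.0 if a[0] == b[0] else 0.0


def same_xy(a, b):
    """Hypothetical similarity: skills count as identical only if (x, y) match."""
    return 1.0 if a == b else 0.0


score_x = vendi_score(outcomes, same_x)    # about 1.0: one effective skill
score_xy = vendi_score(outcomes, same_xy)  # about 3.0: three effective skills
print(score_x, score_xy)
```

The same three skills count as one effective skill under the first similarity function and as three under the second, which is the sense in which the choice of similarity function specifies the form of diversity being measured.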
Original language: English
Number of pages: 17
DOIs
Publication status: Published - 3 Sept 2025
MoE publication type: Not Eligible
Event: Scaling Environments for Agents - San Diego, United States
Duration: 7 Dec 2025 - 7 Dec 2025
https://sea-workshop.github.io/

Workshop

Workshop: Scaling Environments for Agents
Abbreviated title: SEA
Country/Territory: United States
City: San Diego
Period: 07/12/2025 - 07/12/2025
Other: NeurIPS Workshop on Scaling Environments for Agents
Internet address

Funding

This work was funded in part by the Research Council of Finland’s NEXT-IM project (grant no. 349036). The poster was designed while I was supported by a Researchers Abroad grant co-funded by the KAUTE Foundation, the Walter Ahlström Foundation, the Foundation for Economic Education, and the Nokia Foundation. My conference trip was funded in part by the Helsinki Institute for Information Technology (HIIT).

Keywords

  • reinforcement learning
  • self-supervised learning
  • skill diversity
  • task formulation
  • evaluation
