Understanding the Evolution of Linear Regions in Deep Reinforcement Learning

Setareh Cohan*, Nam Hee Kim, David Rolnick, Michiel van de Panne

*Corresponding author for this work

Research output: Article in book/conference proceedings › Conference article in proceedings › Scientific › peer-reviewed

Abstract

Policies produced by deep reinforcement learning are typically characterised by their learning curves, but they remain poorly understood in many other respects. ReLU-based policies result in a partitioning of the input space into piecewise linear regions. We seek to understand how observed region counts and their densities evolve during deep reinforcement learning using empirical results that span a range of continuous control tasks and policy network dimensions. Intuitively, we may expect that during training, the region density increases in the areas that are frequently visited by the policy, thereby affording fine-grained control. We use recent theoretical and empirical results for the linear regions induced by neural networks in supervised learning settings for grounding and comparison of our results. Empirically, we find that the region density increases only moderately throughout training, as measured along fixed trajectories coming from the final policy. However, the trajectories themselves also increase in length during training, and thus the region densities decrease as seen from the perspective of the current trajectory. Our findings suggest that the complexity of deep reinforcement learning policies does not principally emerge from a significant growth in the complexity of functions observed on-and-around trajectories of the policy.
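To make the notion of region density along a trajectory concrete, here is a minimal sketch that estimates it for a small ReLU network. It rests on assumptions that do not come from the paper: the function names (`activation_pattern`, `region_stats`), the dense-sampling approximation, and the toy network are all illustrative, and the authors' actual counting procedure is not described on this page. The idea is that two inputs lie in the same linear region when every hidden ReLU has the same on/off state, so a change in activation pattern between neighbouring samples marks a region boundary.

```python
import numpy as np

def activation_pattern(params, x):
    """Return the on/off state of every hidden ReLU unit for input x."""
    bits = []
    h = x
    for W, b in params[:-1]:  # the last (W, b) pair is a linear output layer
        z = W @ h + b
        bits.append(z > 0)
        h = np.maximum(z, 0.0)
    return np.concatenate(bits)

def region_stats(params, trajectory, samples_per_segment=200):
    """Approximate region transitions, path length, and transitions per unit length.

    `trajectory` is an array of shape (T, input_dim). Each straight segment
    between consecutive points is sampled densely; boundaries closer together
    than the sampling step can be missed, so this is only an estimate.
    """
    transitions = 0
    length = 0.0
    prev = activation_pattern(params, trajectory[0])
    for a, b in zip(trajectory[:-1], trajectory[1:]):
        length += float(np.linalg.norm(b - a))
        for t in np.linspace(0.0, 1.0, samples_per_segment + 1)[1:]:
            cur = activation_pattern(params, (1 - t) * a + t * b)
            if not np.array_equal(cur, prev):
                transitions += 1
            prev = cur
    density = transitions / length if length > 0 else 0.0
    return transitions, length, density

# Hypothetical toy network: 1-D input, two hidden ReLUs with kinks at
# x = 0.25 and x = 0.75, followed by a linear output layer.
toy_params = [
    (np.array([[1.0], [1.0]]), np.array([-0.25, -0.75])),
    (np.array([[1.0, 1.0]]), np.array([0.0])),
]
toy_trajectory = np.array([[0.0], [1.0]])
print(region_stats(toy_params, toy_trajectory))
```

On the toy network the path from 0 to 1 crosses both kinks, so the sketch reports two transitions over unit length. The same network yields a lower density on a longer path that crosses no additional boundaries, which mirrors the abstract's observation that density measured along the current trajectory can fall as trajectories lengthen.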
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Publisher: Morgan Kaufmann Publishers
Number of pages: 13
ISBN (print): 9781713871088
Publication status: Published - 2022
MoE publication type: A4 Article in conference proceedings
Event: Conference on Neural Information Processing Systems - New Orleans, United States
Duration: 28 Nov 2022 → 9 Dec 2022
Conference number: 36
https://nips.cc/

Publication series

Name: Advances in Neural Information Processing Systems
Publisher: Morgan Kaufmann Publishers
Volume: 35
ISSN (print): 1049-5258

Conference

Conference: Conference on Neural Information Processing Systems
Abbreviated title: NeurIPS
Country/Territory: United States
City: New Orleans
Period: 28/11/2022 → 09/12/2022
