Imitation-Enhanced Reinforcement Learning With Privileged Smooth Transition for Hexapod Locomotion

Zhelin Zhang, Tie Liu, Liang Ding*, Haoyu Wang, Peng Xu, Huaiguang Yang, Haibo Gao, Zongquan Deng, Joni Pajarinen

*Corresponding author for this work

Research output: Journal article, Article, Scientific, peer-reviewed


Abstract

Deep reinforcement learning (DRL) methods have shown significant promise in controlling the movement of quadruped robots. However, for systems like hexapod robots, which feature a higher-dimensional action space, it remains challenging for an agent to devise an effective control strategy directly. Currently, no hexapod robots have demonstrated highly dynamic motion. To address this, we propose imitation-enhanced reinforcement learning (IERL), a two-stage approach enabling hexapod robots to achieve dynamic motion through direct control using RL methods. Initially, imitation learning (IL) replicates a basic positional control method, creating a pre-trained policy for basic locomotion. Subsequently, the parameters of this pre-trained model are used as the starting point for the reinforcement learning process that trains the agent. Moreover, we incorporate a smooth transition (ST) method that lets IERL handle the changes in network inputs between the two stages and makes it adaptable to various complex network architectures incorporating latent features. Extensive simulations and real-world experiments confirm that our method effectively tackles the high-dimensional action space challenges of hexapod robots, significantly enhancing learning efficiency and enabling more natural, efficient, and dynamic movements compared to existing methods.

Original language: English
Pages: 350-357
Number of pages: 8
Journal: IEEE Robotics and Automation Letters
Volume: 10
Issue number: 1
Early online date: 2024
Status: Published - Jan 2025
OKM publication type: A1 Original article in a scientific journal
