Automated Excavator Based on Reinforcement Learning and Multibody System Dynamics

Ilya Kurinov, Grzegorz Orzechowski, Perttu Hämäläinen, Aki Mikkola

Research output: Contribution to journal › Article › Scientific › peer-review


Fully autonomous earth-moving heavy equipment able to operate without human intervention can be seen as the primary goal of automated earth construction. Achieving this objective requires that the machines be able to adapt autonomously to complex and changing environments. Recent developments in automation have focused on the application of different machine learning approaches, of which reinforcement learning algorithms are considered the most promising. The key advantage of reinforcement learning is the ability of the system to learn, adapt and work independently in a dynamic environment. This article investigates an application of a reinforcement learning algorithm to the automation of heavy mining machinery. To this end, the training associated with reinforcement learning is carried out using a multibody simulation. The procedure combines a multibody approach with the proximal policy optimization with covariance matrix adaptation (PPO-CMA) learning algorithm to simulate an autonomous excavator. The multibody model includes a representation of the hydraulic system, multiple sensors that observe the state of the excavator, and deformable ground. The simulated task is to load a hopper with soil taken from a chosen point on the ground. The excavator is trained to load the hopper effectively within a given time while avoiding collisions with the ground and the hopper. The proposed system demonstrates the desired behavior after short training times.
Original language: English
Article number: 9268069
Pages (from-to): 213998-214006
Number of pages: 9
Journal: IEEE Access
Early online date: 24 Nov 2020
Publication status: Published - 10 Dec 2020
MoE publication type: A1 Journal article-refereed
