Fully autonomous heavy earth-moving equipment that operates without human intervention can be seen as the primary goal of automated earth construction. Achieving this objective requires machines that can adapt autonomously to complex and changing environments. Recent developments in automation have focused on the application of different machine learning approaches, of which reinforcement learning is considered the most promising. The key advantage of reinforcement learning is that the system can learn, adapt, and work independently in a dynamic environment. This article investigates the application of a reinforcement learning algorithm to the automation of heavy mining machinery. To this end, the training associated with reinforcement learning is carried out in a multibody simulation. The procedure combines a multibody approach with the proximal policy optimization with covariance matrix adaptation learning algorithm to simulate an autonomous excavator. The multibody model includes a representation of the hydraulic system, multiple sensors that observe the state of the excavator, and deformable ground. The simulated task is to load a hopper with soil taken from a chosen point on the ground. The excavator is trained to load the hopper effectively within a given time while avoiding collisions with the ground and the hopper. The proposed system demonstrates the desired behavior after short training times.
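As an illustration of how the training objective described above (load the hopper within a given time while avoiding collisions with the ground and the hopper) might be expressed as a reward signal, the following is a minimal sketch only. It is not the reward used in this work: the names `StepInfo`, `step_reward`, and `episode_done`, as well as every weight, penalty, and the time limit, are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical per-step summary of the simulated excavator state."""
    soil_added_kg: float  # soil mass that reached the hopper during this step
    hit_ground: bool      # an unintended collision with the ground occurred
    hit_hopper: bool      # the excavator collided with the hopper
    elapsed_s: float      # simulated time since the episode started


def step_reward(info: StepInfo,
                soil_weight: float = 1.0,
                collision_penalty: float = 5.0,
                time_penalty: float = 0.01) -> float:
    """Reward soil delivered to the hopper; penalize collisions and elapsed time.

    All coefficients are assumed values for illustration, not tuned settings.
    """
    reward = soil_weight * info.soil_added_kg
    if info.hit_ground:
        reward -= collision_penalty
    if info.hit_hopper:
        reward -= collision_penalty
    # A small constant cost per step encourages finishing the task quickly.
    reward -= time_penalty
    return reward


def episode_done(info: StepInfo, time_limit_s: float = 60.0) -> bool:
    """End the episode once the (assumed) time budget is used up."""
    return info.elapsed_s >= time_limit_s
```

In a setup of this kind, the per-step reward would be returned by the simulation environment to the learning algorithm, and the time limit would bound each training episode.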