TY - JOUR
T1 - Continuous Control Monte Carlo Tree Search Informed by Multiple Experts
AU - Rajamaki, Joose Julius
AU - Hamalainen, Perttu
PY - 2018/7/2
Y1 - 2018/7/2
N2 - Efficient algorithms for 3D character control in a continuous control setting remain an open problem in spite of the remarkable recent advances in the field. We present a sampling-based model-predictive controller that takes the form of a Monte Carlo tree search (MCTS). The tree search utilizes information from multiple sources, including two machine learning models. This allows rapid development of complex skills such as 3D humanoid locomotion with fewer than a million simulation steps, in less than a minute of computing on a modest personal computer. We demonstrate locomotion of 3D characters with varying topologies under disturbances such as heavy projectile hits and abruptly changing target direction. In this paper we also present a new way to combine information from the various sources such that a minimal amount of information is lost. We furthermore extend the neural network involved in the algorithm to represent stochastic policies. Our approach yields a robust control algorithm that is easy to use. While learning, the algorithm runs in near real-time, and after learning the sampling budget can be reduced for real-time operation.
AB - Efficient algorithms for 3D character control in a continuous control setting remain an open problem in spite of the remarkable recent advances in the field. We present a sampling-based model-predictive controller that takes the form of a Monte Carlo tree search (MCTS). The tree search utilizes information from multiple sources, including two machine learning models. This allows rapid development of complex skills such as 3D humanoid locomotion with fewer than a million simulation steps, in less than a minute of computing on a modest personal computer. We demonstrate locomotion of 3D characters with varying topologies under disturbances such as heavy projectile hits and abruptly changing target direction. In this paper we also present a new way to combine information from the various sources such that a minimal amount of information is lost. We furthermore extend the neural network involved in the algorithm to represent stochastic policies. Our approach yields a robust control algorithm that is easy to use. While learning, the algorithm runs in near real-time, and after learning the sampling budget can be reduced for real-time operation.
KW - Continuous Control
KW - Learning (artificial intelligence)
KW - Monte Carlo methods
KW - Monte Carlo Tree Search
KW - Neural networks
KW - Planning
KW - Predictive models
KW - Real-time systems
KW - Reinforcement Learning
KW - Three-dimensional displays
UR - http://www.scopus.com/inward/record.url?scp=85049346027&partnerID=8YFLogxK
U2 - 10.1109/TVCG.2018.2849386
DO - 10.1109/TVCG.2018.2849386
M3 - Article
AN - SCOPUS:85049346027
SN - 1077-2626
VL - 25
SP - 2540
EP - 2553
JO - IEEE Transactions on Visualization and Computer Graphics
JF - IEEE Transactions on Visualization and Computer Graphics
IS - 8
ER -