The thesis studies building blocks for robot skill learning. From these key components, learning frameworks can be constructed that enable robots to acquire motion and manipulation skills autonomously. We study skill learning in two contexts: in-contact and free-space motions. In brief, this thesis investigates how to: (1) learn a policy for in-contact tasks; (2) generalize a free-space motion policy to new situations using a contextual skill model (CSM); and (3) transfer the CSM from simulation to the real world.

Learning an in-contact task such as wood planing from scratch can be time-consuming and dangerous. This problem can be avoided by imitating a policy from a human demonstration. However, mere imitation may not satisfy the objective of the corresponding in-contact task. The thesis proposes a reinforcement learning (RL) framework for improving the performance of an imitated in-contact policy. Policy search for in-contact tasks is achieved by making the motion compliant, which allows exploration in the force profile.

Generalizing a policy to new situations is fundamental to skill learning, as it alleviates the need to learn a new policy in every novel situation. Generalizing a policy refers to synthesizing a function that maps new situations to policies. This function is referred to as a contextual policy or contextual skill model (CSM). The thesis proposes a parametric CSM. Experiments demonstrated that the parametric CSM can extract a global pattern from a database (DB) of policy parameters, leading to significantly better extrapolation capability than non-parametric CSMs. Furthermore, the underlying model of the CSM is fitted to the DB using a novel model selection approach to better represent the underlying regularities of the task.
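The idea of a parametric CSM can be illustrated with a minimal sketch: fit a low-order polynomial, one per policy-parameter dimension, mapping a scalar task context to policy parameters, then query it outside the training range. The function names, the linear ground truth, and the polynomial form below are illustrative assumptions, not the thesis's actual model.

```python
import numpy as np

def fit_parametric_csm(contexts, params, degree=1):
    """Fit one polynomial per policy-parameter dimension via least squares."""
    X = np.vander(contexts, degree + 1)  # design matrix, columns [c^d, ..., 1]
    coeffs, *_ = np.linalg.lstsq(X, params, rcond=None)
    return coeffs

def predict_csm(coeffs, context, degree=1):
    """Predict policy parameters for a new (possibly unseen) context."""
    x = np.vander(np.atleast_1d(context), degree + 1)
    return (x @ coeffs).ravel()

# Training contexts and 2-D policy parameters following a global linear trend
# (an assumed toy pattern standing in for a DB of learned policy parameters).
contexts = np.array([0.1, 0.2, 0.3, 0.4])
params = np.stack([2.0 * contexts + 1.0, -1.0 * contexts + 0.5], axis=1)

coeffs = fit_parametric_csm(contexts, params)
# Extrapolate well outside the training range: the global pattern carries over.
pred = predict_csm(coeffs, 0.8)  # -> approximately [2.6, -0.3]
```

Because the model captures a global pattern rather than interpolating locally, the prediction at context 0.8 remains accurate even though all training contexts lie in [0.1, 0.4]; a non-parametric (e.g. kernel-based) CSM would typically degrade there.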
To speed up learning, the prediction uncertainty of the CSM is calculated using empirical Bayes (EB) and used to guide the exploration of a model-free policy search. In addition, the most promising task is selected by a novel task manager, yielding better future generalization performance with minimum effort. In essence, the thesis presents an incremental learning framework whose main components are a CSM, policy search, model selection, a DB, EB, and a task manager implemented using active learning.

Learning a policy in a simulated environment and transferring it to the real world alleviates the need to learn from scratch or from a demonstration. The thesis proposes to transfer a CSM instead of a single control policy. We developed a simulation-to-real transfer framework that learns a source CSM in simulation and transfers it to the real world, both incrementally. The source CSM is transferred using sample policies from the target environment; experiments indicated that a single sample policy is sufficient. The resulting target CSM achieved significantly better extrapolation capability than zero-shot transfer.
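Two of the framework's components can be sketched in a few lines, under loose assumptions: exploration noise for policy search scaled by the CSM's predictive uncertainty, and a task manager that selects the candidate context where that uncertainty is largest. The per-context standard deviations below stand in for empirical-Bayes estimates; the thesis's actual algorithms are more involved.

```python
import numpy as np

rng = np.random.default_rng(0)

def exploration_samples(mean_params, pred_std, n_samples=5):
    """Sample candidate policies; higher CSM uncertainty -> wider exploration."""
    return mean_params + pred_std * rng.standard_normal((n_samples, mean_params.size))

def select_next_task(candidate_contexts, pred_stds):
    """Task manager: train next where the CSM is least certain."""
    return candidate_contexts[int(np.argmax(pred_stds))]

candidate_contexts = np.array([0.2, 0.5, 0.9])
pred_stds = np.array([0.05, 0.30, 0.10])  # assumed EB predictive std per context
next_ctx = select_next_task(candidate_contexts, pred_stds)  # -> 0.5
samples = exploration_samples(np.zeros(3), pred_stds[1])
```

Selecting the most uncertain context concentrates learning effort where the CSM's predictions are least reliable, which is one plausible reading of choosing the "most promising task" with minimum effort.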
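The claim that one target-environment sample policy suffices can be illustrated with a deliberately simple correction scheme. The sketch assumes the sim-to-real gap is a context-independent offset in policy-parameter space, which is an assumption made for illustration, not the thesis's actual transfer method.

```python
import numpy as np

def source_csm(context):
    """Simulation-trained linear CSM: context -> policy parameters (assumed)."""
    coeffs = np.array([[2.0, -1.0], [1.0, 0.5]])
    return np.array([context, 1.0]) @ coeffs

def transfer_csm(target_context, target_params):
    """Correct the source CSM with one (context, policy) sample from reality."""
    offset = target_params - source_csm(target_context)
    return lambda c: source_csm(c) + offset

# One sample policy observed in the target (real) environment, simulated here
# by shifting the source prediction with an assumed real-world discrepancy.
real_params = source_csm(0.3) + np.array([0.2, -0.1])
target_csm = transfer_csm(0.3, real_params)
pred = target_csm(0.7)  # extrapolate in the target environment
```

Under the offset assumption, a single target sample fully identifies the correction, so the transferred CSM extrapolates in the real environment; zero-shot transfer (using `source_csm` directly) would retain the full sim-to-real error everywhere.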
|Translated title of the contribution||Incremental and Transfer Learning of Contextual Skill Model for Robots|
|Publication status||Published - 2019|
|MoE publication type||G5 Doctoral dissertation (article)|
- reinforcement learning
- active incremental learning
- transfer learning