Safe-To-Explore State Spaces: Ensuring Safe Exploration in Policy Search with Hierarchical Task Optimization

Jens Lundell, Robert Krug, Erik Schaffernicht, Todor Stoyanov, Ville Kyrki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

1 Citation (Scopus)


Policy search reinforcement learning allows robots to acquire skills by themselves. However, the learning procedure is inherently unsafe, as the robot has no a priori way to predict the consequences of the exploratory actions it takes. Exploration can therefore lead to collisions that may harm the robot and/or the environment. In this work we address the safety aspect by constraining exploration to safe-to-explore state spaces. These are formed by decomposing target skills (e.g., grasping) into higher-ranked sub-tasks (e.g., collision avoidance, joint limit avoidance) and lower-ranked movement tasks (e.g., reaching). Sub-tasks are defined as concurrent controllers (policies) in different operational spaces, together with associated Jacobians representing their joint-space mapping. Safety is ensured by learning only the policies that correspond to lower-ranked sub-tasks, and only in the redundant null space of the higher-ranked ones. As a side benefit, learning in sub-manifolds of the state space also improves sample efficiency. Reaching skills performed in simulation and grasping skills performed on a real robot validate the usefulness of the proposed approach.
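
The null-space idea in the abstract can be illustrated with a minimal sketch. This is not the authors' hierarchical task optimization; it is the classical null-space projection for a single higher-ranked task with a one-row Jacobian, which is assumed to be nonzero. The function names, the 3-joint arm, and all numbers are hypothetical and chosen only for illustration.

```python
def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def project_into_null_space(j_row, dq):
    """Remove from dq the component that would change the higher-ranked task.

    j_row is the (assumed nonzero) one-row Jacobian of the higher-ranked task.
    This applies the projector N = I - j^T j / (j j^T) to dq.
    """
    scale = dot(j_row, dq) / dot(j_row, j_row)
    return [v - scale * j for v, j in zip(dq, j_row)]


def combine(j_row, task_rate, dq_explore):
    """Joint velocity that realizes task_rate on the higher-ranked task and
    spends the remaining redundancy on the exploratory motion."""
    norm_sq = dot(j_row, j_row)
    # Minimum-norm joint velocity for the higher-ranked task (pseudo-inverse).
    dq_high = [j * task_rate / norm_sq for j in j_row]
    # Exploration restricted to the null space of the higher-ranked task.
    dq_low = project_into_null_space(j_row, dq_explore)
    return [a + b for a, b in zip(dq_high, dq_low)]


# Hypothetical 3-joint arm: one-row Jacobian of an avoidance task.
j_avoid = [1.0, 0.5, 0.0]
# Exploratory joint velocity proposed by a lower-ranked (learned) policy.
dq_explore = [0.3, -0.2, 0.4]

# The higher-ranked task asks for zero change; exploration must not disturb it.
dq_safe = combine(j_avoid, 0.0, dq_explore)
task_rate_achieved = dot(j_avoid, dq_safe)
```

In this sketch the exploratory motion changes the joints but leaves the higher-ranked task rate untouched, so `task_rate_achieved` stays at the commanded value up to floating-point error, whatever `dq_explore` is.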
Original language: English
Title of host publication: Proceedings of the 18th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2018
Editors: T. Asfour
Number of pages: 7
Publication status: Published - 2018
MoE publication type: A4 Article in a conference publication
Event: IEEE-RAS International Conference on Humanoid Robots - Beijing, China
Duration: 6 Nov 2018 to 9 Nov 2018
Conference number: 18

Publication series

Name: IEEE-RAS International Conference on Humanoid Robots
ISSN (Print): 2164-0572
ISSN (Electronic): 2164-0580


Conference: IEEE-RAS International Conference on Humanoid Robots
Abbreviated title: Humanoids

