Guided Learning of Robust Hurdling Policies with Curricular Trajectory Optimization

Abstract

We combine analytical and learning-based techniques to help researchers solve challenging robot locomotion problems. Specifically, we explore the combination of curricular trajectory optimization (CTO) and deep reinforcement learning (RL) for quadruped hurdling tasks. Our framework enables engineers and researchers to get the generalization capabilities of learned policies and the efficiency of trajectory optimization. We generate trajectories with a curricular optimization algorithm and use them as an imitation learning supervisor for an RL algorithm. We evaluate our approach on various robot hurdling tasks where the robot needs to jump over an obstacle of varying size and location. We achieve greater sample efficiency than state-of-the-art reinforcement learning when solving the task, and significantly greater performance than the original trajectories. Results can be seen at https://sites.google.com/usc.edu/cto-rl.

Publication
In Southern California Robotics Symposium

See project website for more information.

Gautam Salhotra
PhD candidate in robotics

Researcher in robot manipulation and learning