Learning Locomotion Skills from MPC in Sensor Space

Majid Khadiv, Avadesh Meduri, Huaijiang Zhu, Ludovic Righetti, Bernhard Schölkopf

Research output: Contribution to journal › Conference article › peer-review


Nonlinear model predictive control (NMPC) is one of the most powerful tools for generating control policies for legged locomotion. However, the large computational load required to solve an optimal control problem at each control cycle hinders its use for embedded control of legged robots. Furthermore, the need for a high-quality state estimation module makes applying NMPC in the real world very challenging, especially for highly agile maneuvers. In this paper, we propose to use NMPC as an expert and learn control policies directly from proprioceptive sensory measurements. We perform an extensive set of simulations on the quadruped robot Solo12 and show that it is possible to learn different gaits using only proprioceptive sensory information, without any camera or lidar, which are normally used to avoid drift in state estimation. Interestingly, our simulation results show that, with the same structure of the function approximators, learning the estimator and the control policy separately outperforms end-to-end learning of dynamic gaits such as jump and bound. A summary of simulation experiments can be found here.

Original language: English (US)
Pages (from-to): 1218-1230
Number of pages: 13
Journal: Proceedings of Machine Learning Research
State: Published - 2023
Event: 5th Annual Conference on Learning for Dynamics and Control, L4DC 2023 - Philadelphia, United States
Duration: Jun 15 2023 → Jun 16 2023


Keywords

  • Agile locomotion
  • Control in sensor space
  • Learning from MPC

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

