Disturbance Observer-based Control Barrier Functions with Residual Model Learning for Safe Reinforcement Learning
Abstract
Reinforcement learning (RL) agents need to explore their environment to learn optimal behaviors and achieve maximum rewards. However, exploration can be risky when RL is trained directly on real systems, while simulation-based training introduces the challenge of the sim-to-real gap. Recent approaches have leveraged safety filters, such as control barrier functions (CBFs), to penalize unsafe actions during RL training. However, the strong safety guarantees of CBFs rely on a precise dynamic model. In practice, uncertainties always exist, including internal disturbances arising from modeling errors and external disturbances such as wind. In this work, we propose a new safe RL framework based on disturbance rejection-guarded learning, which enables almost model-free RL with an assumed, but not necessarily precise, nominal dynamic model. We demonstrate our results on the Safety Gym benchmark for Point and Car robots on all tasks, where we outperform state-of-the-art approaches that use only residual model learning or a disturbance observer (DOB). We further validate the efficacy of our framework on a physical F1/10 racing car. Videos: https://sites.google.com/view/res-dob-cbf-rl
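For context, a brief sketch of why CBF guarantees hinge on the dynamic model (generic notation, not taken from the paper): for control-affine dynamics with an additive disturbance d,

    \dot{x} = f(x) + g(x)\,u + d,

a continuously differentiable function h whose superlevel set \{x : h(x) \ge 0\} is the safe set certifies forward invariance only if the applied control u satisfies

    L_f h(x) + L_g h(x)\,u + \nabla h(x)^{\top} d \ge -\alpha\big(h(x)\big)

for some class-\mathcal{K} function \alpha. Enforcing this constraint with the nominal model alone (i.e., assuming d = 0) can admit unsafe actions when d is nonzero. In DOB-based CBF schemes such as the one named in the title, a disturbance observer supplies an estimate \hat{d} that stands in for d, with its bounded estimation error absorbed by tightening the constraint.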
BibTeX
@inproceedings{Kalaria-2024-143952,
  author    = {Dvij Kalaria and Qin Lin and John M. Dolan},
  title     = {Disturbance Observer-based Control Barrier Functions with Residual Model Learning for Safe Reinforcement Learning},
  booktitle = {Proceedings of ICRA Workshop on Agile Robots},
  year      = {2024},
  month     = {May},
  keywords  = {Safe Reinforcement Learning, Robust Control Barrier Functions, Disturbance Observer, Residual Model Learning},
}