Abstract:
Navigating quadruped robots through complex, unstructured environments over long horizons remains a significant challenge in robotics. Traditional planning methods offer guarantees such as optimality and long-horizon reasoning, while learning-based methods, particularly those involving deep reinforcement learning (DRL), provide robustness and generalization. In this thesis, we present S3D-OWNS (Skilled 3D-Optimal Waypoint Navigation System), a novel framework that combines the strengths of both approaches to achieve efficient and adaptive quadrupedal locomotion. Our framework integrates a high-level planner with a generalist DRL-trained locomotion policy. The planner is responsible for long-horizon navigation and optimal path planning, while the learned policy handles agile, real-time locomotion tasks such as walking, jumping, and climbing. By leveraging powerful DRL-trained policies, we reduce the dimensionality of the planning state space, making planning more efficient and allowing the locomotion policy to handle complex maneuvers. This combination enables our system to navigate cluttered environments while optimizing for energy consumption, time efficiency, and task success. The S3D-OWNS framework includes several key innovations:
- A goal-conditioned locomotion policy trained across diverse terrains using DRL, which adds robustness and adaptability.
- A sampling-based planner that reasons over whether obstacles are traversable based on the robot’s capabilities, enabling more efficient path planning compared to traditional obstacle avoidance techniques.
- Cost predictors trained using GPU parallelization to estimate energy consumption, time cost, and success likelihood for each planned path segment.
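
As a rough illustration of how per-segment predictions of energy, time, and success likelihood could steer a planner toward traversing an obstacle rather than avoiding it, consider the minimal sketch below. It is not the S3D-OWNS implementation: the names (SegmentPrediction, segment_cost, cheapest_path), the weights, the failure penalty, and all numeric values are hypothetical, and a plain Dijkstra search over a tiny hand-built waypoint graph stands in for the sampling-based planner.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentPrediction:
    """Hypothetical output of the cost predictors for one path segment."""
    energy_j: float   # predicted energy use (illustrative units)
    time_s: float     # predicted traversal time (illustrative units)
    p_success: float  # predicted probability the policy completes the segment


def segment_cost(pred, w_energy=1.0, w_time=1.0, failure_penalty=100.0):
    """Fold the three predictions into one scalar edge cost (assumed form)."""
    expected_failure = (1.0 - pred.p_success) * failure_penalty
    return w_energy * pred.energy_j + w_time * pred.time_s + expected_failure


def cheapest_path(edges, start, goal, **weights):
    """Dijkstra over directed edges {(u, v): SegmentPrediction}."""
    graph = {}
    for (u, v), pred in edges.items():
        graph.setdefault(u, []).append((v, segment_cost(pred, **weights)))
    frontier = [(0.0, start, [start])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, step in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(frontier, (cost + step, nxt, path + [nxt]))
    return float("inf"), []


# Two made-up routes: climb over a block, or take a longer detour around it.
EDGES = {
    ("start", "block_top"): SegmentPrediction(8.0, 2.0, 0.95),
    ("block_top", "goal"): SegmentPrediction(6.0, 2.0, 0.95),
    ("start", "detour"): SegmentPrediction(12.0, 6.0, 0.99),
    ("detour", "goal"): SegmentPrediction(12.0, 6.0, 0.99),
}

if __name__ == "__main__":
    print(cheapest_path(EDGES, "start", "goal"))
    print(cheapest_path(EDGES, "start", "goal", failure_penalty=1000.0))
```

With the default penalty, the route over the block is cheaper in this toy example. Raising failure_penalty makes the slightly riskier climb less attractive, and the search switches to the detour.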
Through extensive experimentation with the Unitree Go1 quadruped in simulated environments featuring obstacles such as blocks and gaps, we demonstrate that S3D-OWNS significantly outperforms traditional collision-avoidance planners. Our ablation studies show that the framework not only improves navigation efficiency but also reduces long-term operational costs by optimizing energy use. The system’s ability to reason about obstacle traversability allows it to plan more effectively while leveraging the agility of the locomotion policy to handle challenging terrain. This research advances the field of quadrupedal robotics by demonstrating how hybrid systems can combine classical planning with modern machine learning techniques to achieve both optimality and adaptability. The S3D-OWNS framework is scalable across different quadruped models and sensor configurations, making it suitable for various applications such as industrial automation, search-and-rescue missions, and exploration in unstructured environments.
Committee:
Prof. Dimitrios (Dimi) Apostolopoulos (co-advisor)
Prof. Maxim Likhachev (co-advisor)
Rishi Veerapaneni