Scaling up Robot Skill Learning with Generative Simulation

Master's Thesis, Tech. Report, CMU-RI-TR-24-35, July, 2024

View Publication

Abstract

Generalist robots need to learn a wide variety of skills to perform diverse tasks across multiple environments. Current robot training pipelines rely on humans to either provide kinesthetic demonstrations or program simulation environments with manually-designed reward functions for reinforcement learning. Such human involvement is an important bottleneck towards scaling up robot learning across diverse tasks and environments.

In this thesis, we present Generation to Simulation (Gen2Sim), a method for scaling up robot skill learning in simulation by automating generation of 3D assets, task descriptions, task decompositions and reward functions using large pre-trained generative models of language and vision. We generate 3D assets for simulation by lifting open-world 2D object-centric images to 3D using image diffusion models and querying LLMs to deter- mine plausible physics parameters. Given URDF files of generated and human-developed assets, we chain-of-thought prompt LLMs to map these to relevant task descriptions, temporal decompositions, and corresponding python reward functions for reinforcement learning. We show Gen2Sim succeeds in learning policies for diverse long horizon tasks, where reinforcement learning with non temporally decomposed reward functions fails.

Gen2Sim provides a viable path for scaling up robot skill learning in simulation, both by diversifying and expanding task and environment development, and by facilitating the discovery of reinforcement-learned behaviors through temporal task decomposition in RL. Our work contributes hundreds of simulated assets, tasks and demonstrations, taking a step towards fully autonomous robotic manipulation skill acquisition in simulation.

BibTeX

@mastersthesis{Katara-2024-142667,
author = {Pushkal Katara},
title = {Scaling up Robot Skill Learning with Generative Simulation},
year = {2024},
month = {July},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-24-35},
keywords = {Robot Learning, Generative Models, Robotic Simulation},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.