
PhD Thesis Proposal

Yufei Wang
PhD Student
Robotics Institute, Carnegie Mellon University
Thursday, April 17
11:00 am to 12:30 pm
GHC 6115
Scaling, Automating and Adapting Sim-to-real Policy Learning

Abstract:
Building a generalist robot capable of performing diverse tasks in unstructured environments remains a longstanding challenge. A recent trend in robot learning aims to address this by scaling up demonstration datasets for imitation learning. However, most large-scale robotics datasets are collected in the real world, often via manual teleoperation. This process is labor-intensive, slow, hardware-dependent, and poses safety risks, all of which limit its scalability.

Physics-based simulation offers a scalable, safe, and efficient alternative for generating large demonstration datasets. However, two major challenges remain: (1) substantial manual effort is required to design simulation assets and scenes, and to create training supervision such as reward functions, and (2) the sim-to-real gap in both sensing and dynamics can hinder real-world deployment of simulation-trained policies.

In this thesis, we explore using simulation to generate large-scale datasets to learn robotic manipulation policies that generalize across diverse objects and environments, while addressing the above challenges. We will discuss the following three lines of work:

    1. Large-Scale Sim-to-Real Transfer: We show that policies trained on large-scale simulation data, when combined with the right policy representation and observation space, can transfer zero-shot to the real world and generalize to diverse scenarios. We demonstrate this on two complex manipulation tasks: robot-assisted dressing and articulated object manipulation.
    2. Automating Simulation Dataset Generation: We explore how to automate the creation of large simulation datasets, including tasks, assets, scenes, and training supervision, through a paradigm called generative simulation, and how to generate complex reward functions using feedback from vision-language foundation models.
    3. Efficient Adaptation of Sim-to-Real Policies: No simulation is perfect. We study how to use small amounts of additional real-world data to efficiently improve the performance and safety of simulation-trained policies.

Finally, we will propose two potential directions for future work and solicit feedback on them: 1) in-context imitation learning for fast adaptation using only a single demonstration; and 2) multi-task 3D generalist policy learning by combining diverse, large-scale simulation datasets.

Thesis Committee Members:
Zackory Erickson, co-chair
David Held, co-chair
Katerina Fragkiadaki
Chuang Gan, UMass Amherst and MIT-IBM Watson AI Lab
Dieter Fox, University of Washington and Nvidia