Carnegie Mellon University
Abstract:
Robust and generalizable robots that can autonomously manipulate objects in semi-structured environments can bring material benefits to society. Data-driven learning approaches are crucial for enabling such systems by identifying and exploiting patterns in semi-structured environments, allowing robots to adapt to novel scenarios with minimal human supervision. However, despite significant prior work in learning for robot manipulation, large gaps remain before robots can be widely deployed in the real world. This thesis addresses three particular challenges to advance toward this goal: sensing in semi-structured environments, adapting manipulation to novel scenarios, and flexible planning for diverse skills and tasks. A common theme among the discussed approaches is enabling efficient and generalizable learning by incorporating “structures,” or priors specific to robot manipulation, into the design and implementation of learning algorithms.
The completed work addresses each of the three challenges above. We first leverage contact-based sensing in scenarios that are difficult for vision-based perception. In one work, we use contact feedback to track in-hand object poses during dexterous manipulation. In another, we learn to localize contacts on the surface of robot arms to enable whole-arm sensing. Then, we explore adapting manipulation to novel objects and environments for both model-based and model-free skills. We show how learning task-oriented interactive perception can improve the performance of downstream model-based skills by identifying relevant dynamics parameters. We also show how incorporating object-centric priors can make learning model-free skills more efficient and generalizable. Lastly, we develop a flexible search-based task planner that relaxes assumptions on skill and task representations made in prior work.
The proposed work builds upon the completed work to tackle the problem of contact-rich object rearrangement in constrained environments with structured clutter. This domain is common in everyday settings like homes and offices, and it presents challenges in perception, action, learning, and planning that current methods do not adequately address. These challenges include reasoning about contact-rich manipulation without object models, maintaining state representations robust to occlusions, and efficiently planning long-horizon tasks with many objects. The algorithms and systems developed here can be applied to other task domains, and they can both serve as building blocks for, and inform, future work on robust and generalizable learning for robot manipulation.
Thesis Committee Members:
Oliver Kroemer, Co-Chair
Maxim Likhachev, Co-Chair
David Held
Shuran Song, Columbia University