Carnegie Mellon University
Abstract:
Learned policies often fail to generalize across environment variations such as different objects, object arrangements, or camera viewpoints. Moreover, most policies are trained and tested in simulation, and the sim-to-real gap remains large under weak visual representations that do not disentangle the scene from the objects in it.
We first propose a visually grounded library-of-behaviors approach for learning to manipulate diverse objects across varying initial and goal configurations and camera placements. Our key innovation is to split the standard image-to-action mapping into two separate modules that use different types of perceptual input: (1) a behavior selector, which conditions on intrinsic, semantically rich object appearance features to select the behaviors that can successfully perform the desired task on the object at hand, and (2) a library of behaviors, each of which conditions on extrinsic, abstract object properties, such as object location and pose, to predict the actions to execute over time. We test our framework on pushing and grasping diverse objects in simulation, as well as on transporting rigid, granular, and liquid food ingredients in a real robot setup. Our model outperforms image-to-action mappings that do not separate static and dynamic object properties.
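The two-module split described above can be illustrated with a minimal sketch. It is not the thesis implementation: every name here (`ObjectState`, `select_behavior`, `push_toward_goal`, `grasp_and_lift`, `BEHAVIORS`) is hypothetical, the linear scorer is only a stand-in for a learned selector, and the behaviors are hand-written placeholders for learned policies. The point it shows is the separation of inputs: the selector sees only appearance features, and each behavior sees only location-like state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

Vec = Tuple[float, float]


@dataclass
class ObjectState:
    """Extrinsic, abstract properties: the only input a behavior sees."""
    position: Vec
    goal: Vec


def push_toward_goal(state: ObjectState) -> Vec:
    """Hypothetical behavior: take a small planar step toward the goal."""
    dx = state.goal[0] - state.position[0]
    dy = state.goal[1] - state.position[1]
    return (0.1 * dx, 0.1 * dy)


def grasp_and_lift(state: ObjectState) -> Vec:
    """Hypothetical behavior: placeholder that commands no planar motion."""
    return (0.0, 0.0)


# The library of behaviors, keyed by an assumed name.
BEHAVIORS: Dict[str, Callable[[ObjectState], Vec]] = {
    "push": push_toward_goal,
    "grasp": grasp_and_lift,
}


def select_behavior(appearance: Sequence[float],
                    weights: Dict[str, Sequence[float]]) -> str:
    """Pick the highest-scoring behavior from appearance features alone.

    A linear score per behavior is an assumption made for illustration.
    """
    scores = {
        name: sum(w * f for w, f in zip(ws, appearance))
        for name, ws in weights.items()
    }
    return max(scores, key=scores.get)


def act(appearance: Sequence[float], state: ObjectState,
        weights: Dict[str, Sequence[float]]) -> Tuple[str, Vec]:
    """Select a behavior from appearance, then run it on extrinsic state."""
    name = select_behavior(appearance, weights)
    return name, BEHAVIORS[name](state)
```

In this sketch, calling `act([0.9, 0.1], ObjectState((0.0, 0.0), (1.0, 2.0)), {"push": [1.0, 0.0], "grasp": [0.0, 1.0]})` selects the `"push"` behavior and returns a step pointing toward the goal. Changing the camera viewpoint would affect only how `appearance` and `state` are extracted, not the behaviors themselves.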
We then propose an end-to-end learning framework that jointly learns to choose among different tools and to deploy tool-conditioned policies from a limited number of human demonstrations, directly on a real robot platform. Object rearrangement and cleaning tasks in complex scenes require switching between tools correctly and deploying the one suited to each step. We evaluate our method on picking and placing with a parallel gripper and a suction cup, sweeping with a brush, and household rearrangement tasks, generalizing to different configurations, novel objects, and cluttered scenes in the real world. Finally, we present a long-horizon planning framework that uses our multi-tool setup to successfully manipulate elastoplastic objects such as dough.
Christopher G. Atkeson (advisor)
Katerina Fragkiadaki (co-advisor)
Oliver Kroemer
Thomas Weng