PhD Thesis Proposal

Helen Jiang
PhD Student, Robotics Institute, Carnegie Mellon University
Wednesday, October 12
1:00 pm to 2:30 pm
NSH 3305
Learning via Visual-Tactile Interaction
Abstract:

Humans learn by interacting with their surroundings using all of their senses. The first of these senses to develop is touch, and it is the first way that young humans explore their environment, learn about objects, and tune their cost functions (via pain or treats). Yet, robots are often denied this highly informative and fundamental sensory information, instead relying fully on visual systems. In this thesis proposal, we explore how combining tactile sensing with visual understanding can improve how robots learn from interaction.

We begin by studying how robots can learn from visual interaction alone. We propose semantic curiosity, an intrinsic motivation reward that favors temporal inconsistencies in object detections along a trajectory and is used to train an exploration policy. Our experiments demonstrate that exploration driven by semantic curiosity leads to better object detection performance.
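
To make the reward concrete, here is a minimal, hypothetical sketch (not the proposal's actual implementation): it simply counts how often a tracked object's predicted class label flips between consecutive frames of a trajectory, which is one way to score temporal inconsistency in detections. The data structures and function name are illustrative assumptions.

def semantic_curiosity_reward(trajectory):
    """Toy intrinsic reward: temporal inconsistency of object detections.

    `trajectory` is a list of frames; each frame maps a tracked object id
    to the class label the detector predicted in that frame. The reward
    grows each time a tracked object's label changes, marking views that
    confuse the detector and are therefore worth exploring.
    """
    last_label = {}
    inconsistencies = 0
    for frame in trajectory:
        for obj_id, label in frame.items():
            if obj_id in last_label and label != last_label[obj_id]:
                inconsistencies += 1
            last_label[obj_id] = label
    return float(inconsistencies)

# Example: object 7 is detected as "chair", then "sofa", then "chair" -> reward 2.0
frames = [{7: "chair"}, {7: "sofa"}, {7: "chair"}]
print(semantic_curiosity_reward(frames))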

Next, we present PoseIt, a visual-tactile dataset for studying how the pose in which a grasped object is held influences grasp stability. We train a classifier to predict grasp stability from this multi-modal input and find that it generalizes well to new objects and new poses.
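
As a rough illustration of what such a multi-modal classifier could look like (the dataset's actual model, features, and dimensions are not specified here, so all sizes and names below are assumptions), this PyTorch sketch fuses a visual embedding and a tactile embedding and outputs a single stable/unstable logit.

import torch
import torch.nn as nn

class GraspStabilityClassifier(nn.Module):
    """Illustrative visual-tactile fusion model (all layer sizes are made up)."""

    def __init__(self, visual_dim=512, tactile_dim=128, hidden_dim=256):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, hidden_dim)
        self.tactile_proj = nn.Linear(tactile_dim, hidden_dim)
        self.head = nn.Sequential(
            nn.ReLU(),
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # logit: will the grasp hold in this pose?
        )

    def forward(self, visual_feat, tactile_feat):
        fused = torch.cat(
            [self.visual_proj(visual_feat), self.tactile_proj(tactile_feat)], dim=-1
        )
        return self.head(fused)

# Dummy forward pass with pre-extracted features for a batch of 4 grasps.
model = GraspStabilityClassifier()
logits = model(torch.randn(4, 512), torch.randn(4, 128))
print(torch.sigmoid(logits).shape)  # torch.Size([4, 1])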

We then focus on more fine-grained object manipulation. Thin, deformable objects such as cables are particularly susceptible to severe gripper/object occlusions, which makes it difficult to continuously sense the cable state from vision alone. We propose combining visual perception with tactile-guided motion primitives to handle cable routing and assembly.
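
To give a sense of what a tactile-guided primitive means, the small control loop below is a hypothetical cable-following step: it reads the cable's orientation from the tactile imprint and issues a delta-pose command that slides along the cable while steering back toward alignment. The sensor and robot interfaces are placeholders, not an actual API from this work.

import math

def follow_cable_step(cable_angle_rad, slide_step=0.01, gain=0.5):
    """One iteration of a hypothetical tactile-guided cable-following primitive.

    `cable_angle_rad` is the cable's orientation estimated from the tactile
    imprint (0 means the cable is aligned with the direction of travel).
    The primitive slides a fixed distance along the cable and rotates the
    wrist to reduce the misalignment.
    """
    dyaw = -gain * cable_angle_rad            # steer back toward alignment
    dx = slide_step * math.cos(cable_angle_rad)
    dy = slide_step * math.sin(cable_angle_rad)
    return dx, dy, dyaw                       # delta-pose command for the gripper

# Example: cable tilted 0.2 rad in the tactile frame -> small corrective turn.
print(follow_cable_step(0.2))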

Finally, we propose to incorporate human demonstrations to better teach robots to use visual and tactile feedback to complete challenging tasks. We design an easy-to-use hand-held gripper, equipped with an egocentric camera and tactile sensors, to collect human demonstrations. We will use the collected demonstrations to train a visual- and tactile-conditioned policy via imitation learning.
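
As a hedged sketch of what training such a policy could look like, the snippet below runs one behavior-cloning step on dummy (camera image, tactile image, action) tuples; the architecture, image sizes, and 7-dimensional action space are illustrative assumptions, not the proposal's design.

import torch
import torch.nn as nn

class VisuoTactilePolicy(nn.Module):
    """Illustrative policy: encode camera and tactile images, predict an action."""

    def __init__(self, action_dim=7):
        super().__init__()

        def encoder():
            return nn.Sequential(
                nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )

        self.rgb_enc = encoder()
        self.tactile_enc = encoder()
        self.head = nn.Sequential(
            nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, action_dim)
        )

    def forward(self, rgb, tactile):
        z = torch.cat([self.rgb_enc(rgb), self.tactile_enc(tactile)], dim=-1)
        return self.head(z)

# One behavior-cloning step on a dummy batch of 8 demonstration frames.
policy = VisuoTactilePolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
rgb = torch.randn(8, 3, 96, 96)
tactile = torch.randn(8, 3, 96, 96)
action = torch.randn(8, 7)
opt.zero_grad()
loss = nn.functional.mse_loss(policy(rgb, tactile), action)
loss.backward()
opt.step()
print(loss.item())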

Thesis Committee Members:
Wenzhen Yuan, Chair
Abhinav Gupta
David Held
Adithya Murali, NVIDIA Research