Deep Learning for Tactile Sensing: Development to Deployment - Robotics Institute Carnegie Mellon University

PhD Thesis Defense

Raunaq Mahesh Bhirangi
PhD Student, Robotics Institute,
Carnegie Mellon University
Wednesday, August 21
1:00 pm to 3:00 pm
NSH 1305
Deep Learning for Tactile Sensing: Development to Deployment

Abstract:
Sensing is widely acknowledged as essential for robots interacting with the physical environment. However, few contemporary sensors have gained widespread use among roboticists. This thesis proposes a framework for incorporating sensors into a robot learning paradigm, from development to deployment, through the lens of ReSkin — a versatile and scalable magnetic tactile sensor. By examining design, integration, representation learning, and policy learning in the context of ReSkin, we aim to provide guidance on building effective sensing systems for robotics.

We begin with the design of ReSkin — a low-cost, compact, and versatile platform for tactile sensing — and propose a self-supervised learning technique that enables sensor replaceability by adapting learned models to generalize to new instances of the sensor. Next, we investigate the scalability of ReSkin in the context of dexterous manipulation: we introduce the D’Manus, an inexpensive, modular, and robust platform with integrated large-area ReSkin sensing, aimed at satisfying the large-scale data collection demands of robot learning. Building on lessons from the development of ReSkin and the D’Manus, we propose MoreSkin — an upgraded sensor with a streamlined fabrication procedure that further reduces variability across sensor instances. MoreSkin is as easy to integrate as putting on a phone case, eliminates the need for adhesion, and admits zero-shot transfer of learned policies across sensor instances.

Beyond sensor integration, we explore representation learning for sensors including but not limited to ReSkin. Sensory data is typically sequential and continuous; however, most research on existing sequential architectures like LSTMs and Transformers focuses primarily on discrete modalities such as text and DNA. To address this gap, we propose Hierarchical State Space (HiSS) models, a conceptually simple and novel technique for continuous sequential prediction. HiSS creates a temporal hierarchy by stacking structured state-space models on top of each other, and outperforms state-of-the-art sequence models such as causal Transformers, LSTMs, S4, and Mamba. Further, we introduce CSP-Bench, a new benchmark for continuous sequence-to-sequence prediction (CSP) from real-world sensory data. CSP-Bench aims to address the lack of real-world datasets available for CSP tasks, providing a valuable resource for researchers working in this area.
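The hierarchical stacking described above can be illustrated with a minimal sketch: a low-level state-space model summarizes fixed-size chunks of a continuous sensor stream, and a high-level state-space model runs over those chunk summaries. This is a toy NumPy illustration of the stacking idea only, with fixed (untrained) linear recurrences; the function names, dimensions, and chunking scheme are illustrative assumptions, and the actual HiSS models use learned structured SSMs such as S4 or Mamba.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Linear state-space recurrence: h_t = A h_{t-1} + B x_t,  y_t = C h_t."""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:
        h = A @ h + B @ x_t
        ys.append(C @ h)
    return np.stack(ys)

def hierarchical_ssm(x, low, high, chunk_size):
    """Toy temporal hierarchy: a low-level SSM summarizes each chunk of the
    input sequence; a high-level SSM then runs over the chunk summaries."""
    summaries = []
    for start in range(0, x.shape[0], chunk_size):
        chunk_out = ssm_scan(x[start:start + chunk_size], *low)
        summaries.append(chunk_out[-1])  # last output summarizes the chunk
    return ssm_scan(np.stack(summaries), *high)

rng = np.random.default_rng(0)
d_in, d_state, d_out = 3, 8, 2
# Fixed random (A, B, C) triples stand in for learned structured SSM layers.
low = (0.9 * np.eye(d_state), rng.normal(size=(d_state, d_in)),
       rng.normal(size=(d_out, d_state)))
high = (0.9 * np.eye(d_state), rng.normal(size=(d_state, d_out)),
        rng.normal(size=(d_out, d_state)))
x = rng.normal(size=(64, d_in))  # a 64-step continuous sensor stream
y = hierarchical_ssm(x, low, high, chunk_size=16)
print(y.shape)  # 64 steps -> 4 chunk summaries -> (4, 2) outputs
```

The high-level model sees a sequence 16x shorter than the raw stream, which is the intuition behind the temporal hierarchy: low-level dynamics are absorbed within chunks, leaving the upper layer to model slower structure.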

Finally, we deploy ReSkin and MoreSkin in a policy learning setting and explore the interplay between different modalities in precise manipulation. Specifically, we investigate the effectiveness of overhead cameras, wrist cameras, and tactile sensors in learning policies for tasks requiring millimeter-level precision, and discuss the strengths and weaknesses of each modality. We conclude by summarizing our takeaways from the journey of ReSkin from development to deployment, and outline promising directions for bringing tactile sensing into the fold of mainstream robotics research.

Thesis Committee Members:
Abhinav Gupta, Co-Chair
Carmel Majidi, Co-Chair
Deepak Pathak
Lerrel Pinto, New York University