Scene Understanding from RGB-D Images - Robotics Institute Carnegie Mellon University

VASC Seminar

Saurabh Gupta, Graduate Student, University of California, Berkeley
Monday, April 11
3:00 pm to 4:00 pm
Scene Understanding from RGB-D Images

Event Location: 1507 Newell Simon Hall
Bio: Saurabh Gupta is a Ph.D. student at UC Berkeley, where he is advised by Jitendra Malik. His research interests include computer vision and machine learning. During his Ph.D., he has studied the problem of scene understanding from RGB-D images. His work has been supported by the Berkeley Fellowship and the Google Fellowship in Computer Vision.

Abstract: The focus of this talk will be on detailed scene understanding from RGB-D images. We approach this problem by studying a variety of central vision problems, such as bottom-up grouping, object detection, instance segmentation, and pose estimation, in the context of RGB-D images, and finally by aligning CAD models to objects in the scene. This results in a detailed output that goes beyond what most current computer vision algorithms produce (a bounding box or a segmentation mask for the object of interest) and is useful for a variety of real-world applications, such as perceptual robotics and augmented reality. A central question in this work is how to learn good features for depth images, given that labeled RGB-D datasets are much smaller than the labeled RGB datasets (such as ImageNet) typically used for feature learning. I will describe our “cross-modal distillation” technique, which allows us to leverage easily available annotations on RGB images to learn representations on depth images. In addition, I will briefly talk about some work on vision and language that I did during an internship at Microsoft Research.