Carnegie Mellon University
12:30 pm to 2:00 pm
NSH 4305
Title: Unsupervised Learning for 3D Reconstruction and Blocks World Representation
Abstract: Recovering the dense 3D structure of a scene from its images has been a long-standing goal in computer vision. Recent years have seen attempts to encode richer priors into geometry-based pipelines through the introduction of learning-based methods. We argue that the form of 3D supervision required by such methods is too onerous and is not naturally available, and that it is therefore of both practical and scientific interest to pursue solutions that do not rely on such 3D supervision.
In this thesis, we attempt to bridge the worlds of geometric modeling and deep learning by asking how geometric constraints can be used to obtain a supervisory signal for the task of reconstructing and representing the 3D world efficiently. We first present an unsupervised learning-based approach for 3D reconstruction, the output of which is a 3D point cloud. When trained with our proposed robust photometric consistency objective, deep multi-view stereo (MVS) models produce significantly better 3D reconstructions.
To represent the reconstructions efficiently, we draw inspiration from Larry Roberts’ famous Blocks World of 1965. We introduce a deep learning framework that represents 3D point clouds as an assembly of blocks, yielding a lightweight representation that reduces memory by several orders of magnitude. We describe how geometric relationships between points and surfaces, along with physical priors, can be utilized to provide a supervisory signal for training deep models. We also present a synthetic-to-real transfer learning setup with a differentiable matching loss that facilitates supervised learning of such blocks world representations.
Committee:
Martial Hebert, chair
Abhinav Gupta
Adam Harley