
PhD Speaking Qualifier

Jason Zhang
PhD Student, Robotics Institute, Carnegie Mellon University
Tuesday, April 19
2:00 pm to 3:00 pm
Energy-based Joint Pose Estimation for 3D Reconstruction

Abstract:
In this talk, I will describe a data-driven method for inferring camera poses given a sparse collection of images of an arbitrary object. This task is a core component of classic geometric pipelines such as structure from motion (SfM), and it also serves as a vital preprocessing requirement for contemporary neural approaches to object reconstruction (e.g. NeRF). In contrast to existing correspondence-driven methods, which do not perform well given sparse views, we propose a top-down, prediction-driven approach for estimating camera poses. Our key technical insight is the use of an energy-based formulation for representing distributions over relative camera transformations, which allows us to explicitly represent multiple camera modes arising from object symmetries and/or views. Leveraging these relative predictions, we jointly estimate a consistent set of camera poses from multiple images. We show that, given sparse images, our approach outperforms state-of-the-art SfM and SLAM methods as well as direct pose regression on both seen and unseen categories. Our system can be a stepping stone toward in-the-wild reconstruction from multi-view datasets.
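
To give a rough sense of the energy-based formulation described above, the sketch below scores a set of candidate relative rotations for an image pair with a small network and normalizes the scores into a categorical distribution over a discretized grid of hypotheses, so that multiple modes (e.g. those induced by object symmetries) can be represented explicitly. This is a minimal illustration under assumed names and dimensions (PairwiseEnergy, relative_rotation_distribution, feat_dim=256, a 64-element rotation grid), not the speaker's actual implementation.

```python
# Illustrative sketch only (assumed names and dimensions, not the speaker's code):
# score candidate relative rotations and normalize them over a discrete grid,
# yielding a distribution that can keep multiple modes.
import torch
import torch.nn as nn


class PairwiseEnergy(nn.Module):
    """Assigns an unnormalized score to a candidate relative rotation
    given features extracted from two images."""

    def __init__(self, feat_dim=256):
        super().__init__()
        # Input: concatenated image features plus a flattened 3x3 rotation.
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim + 9, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, feat_i, feat_j, rotations):
        # feat_i, feat_j: (feat_dim,) image features; rotations: (K, 3, 3) hypotheses.
        k = rotations.shape[0]
        pair = torch.cat([feat_i, feat_j]).expand(k, -1)
        x = torch.cat([pair, rotations.reshape(k, 9)], dim=-1)
        return self.mlp(x).squeeze(-1)  # (K,) unnormalized energies


def relative_rotation_distribution(energy_net, feat_i, feat_j, rotation_grid):
    """Normalize energies over a discrete grid of SO(3) hypotheses,
    producing a (possibly multi-modal) categorical distribution."""
    energies = energy_net(feat_i, feat_j, rotation_grid)
    return torch.softmax(energies, dim=0)


if __name__ == "__main__":
    # Toy usage with random features and crude orthonormal "rotation" samples.
    torch.manual_seed(0)
    net = PairwiseEnergy(feat_dim=256)
    feat_i, feat_j = torch.randn(256), torch.randn(256)
    grid, _ = torch.linalg.qr(torch.randn(64, 3, 3))
    probs = relative_rotation_distribution(net, feat_i, feat_j, grid)
    print(probs.shape, probs.sum())  # torch.Size([64]), sums to 1
```

Joint estimation over more than two images, as described in the abstract, would then amount to searching for a consistent set of global poses whose induced pairwise relative rotations score well under these per-pair distributions.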

Committee:
Deva Ramanan
Abhinav Gupta
David Held
Brian Okorn