Marr Revisited: 2D-3D Alignment via Surface Normal Prediction
Abstract
We introduce an approach that leverages surface normal predictions, along with appearance cues, to retrieve 3D models for objects depicted in 2D still images from a large CAD object library. Critical to the success of our approach is the ability to recover accurate surface normals for objects in the depicted scene. We introduce a skip-network model built on the pre-trained Oxford VGG convolutional neural network (CNN) for surface normal prediction. Our model achieves state-of-the-art accuracy on the NYUv2 RGB-D dataset for surface normal prediction, and recovers fine object detail compared to previous methods. Furthermore, we develop a two-stream network over the input image and predicted surface normals that jointly learns pose and style for CAD model retrieval. When using the predicted surface normals, our two-stream network matches prior work that uses surface normals computed from RGB-D images on the task of pose prediction, and achieves state-of-the-art performance when using RGB-D input. Finally, our two-stream network allows us to retrieve CAD models that better match the style and pose of a depicted object compared with baseline approaches.
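To make the skip-network idea concrete, the sketch below (in PyTorch, which the paper does not specify) predicts a per-pixel surface normal map by tapping features from several stages of a pre-trained VGG-16, upsampling them to the input resolution, and fusing them with a small convolutional head. The stage splits, channel reductions, and fusion head here are illustrative assumptions, not the authors' exact architecture.

# Minimal skip-network sketch for surface normal prediction (assumed design,
# not the paper's exact model): hypercolumn-style fusion of multi-layer VGG features.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class SkipNormalNet(nn.Module):
    def __init__(self):
        super().__init__()
        vgg = torchvision.models.vgg16(weights=None).features  # pre-trained weights in practice
        # Split VGG-16 into stages so intermediate feature maps can be tapped.
        self.stage1 = vgg[:5]     # conv1 block -> 64 channels
        self.stage2 = vgg[5:10]   # conv2 block -> 128 channels
        self.stage3 = vgg[10:17]  # conv3 block -> 256 channels
        self.stage4 = vgg[17:24]  # conv4 block -> 512 channels
        # 1x1 convs reduce each skip feature before fusion (assumed sizes).
        self.reduce = nn.ModuleList(
            [nn.Conv2d(c, 64, kernel_size=1) for c in (64, 128, 256, 512)]
        )
        # Fusion head maps concatenated skip features to a 3-channel normal map.
        self.head = nn.Sequential(
            nn.Conv2d(4 * 64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 3, kernel_size=1),
        )

    def forward(self, x):
        h, w = x.shape[2:]
        feats = []
        for stage, red in zip(
            (self.stage1, self.stage2, self.stage3, self.stage4), self.reduce
        ):
            x = stage(x)
            # Upsample every skip feature to the input resolution before fusing.
            feats.append(
                F.interpolate(red(x), size=(h, w), mode="bilinear", align_corners=False)
            )
        normals = self.head(torch.cat(feats, dim=1))
        # Normalize to unit-length per-pixel surface normals.
        return F.normalize(normals, dim=1)

if __name__ == "__main__":
    net = SkipNormalNet()
    out = net(torch.randn(1, 3, 224, 224))
    print(out.shape)  # torch.Size([1, 3, 224, 224])

In the paper's pipeline, a normal map predicted this way would then accompany the RGB image as the second input stream of the pose/style retrieval network; that two-stream stage is not shown here.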
BibTeX
@conference{Bansal-2016-113342,
  author    = {Aayush Bansal and Bryan Russell and Abhinav Gupta},
  title     = {Marr Revisited: 2D-3D Alignment via Surface Normal Prediction},
  booktitle = {Proceedings of (CVPR) Computer Vision and Pattern Recognition},
  year      = {2016},
  month     = {June},
  pages     = {5965--5974},
}