Discriminatively-guided Deliberative Perception for Pose Estimation of Multiple 3D Object Instances
Abstract
We introduce a novel paradigm for model-based multi-object recognition and 3 DoF pose estimation from 3D sensor data that integrates exhaustive global reasoning with discriminatively-trained algorithms in a principled fashion. Typical approaches for this task are based on scene-to-model feature matching or regression by statistical learners trained on a large database of annotated scenes. These approaches are fast but sensitive to occlusions, the choice of features, and/or the training data. Generative approaches, on the other hand, such as methods based on rendering and verification, are robust to occlusions and require no training, but are slow at test time. We conjecture that robust and efficient perception can be achieved through a combination of generative methods and discriminatively-trained approaches. To this end, we introduce the Discriminatively-guided Deliberative Perception (D2P) paradigm, which has the following desirable properties: a) it is a single search algorithm that looks for the ‘best’ rendering of the scene that matches the input, b) it can be guided by any discriminative algorithm, or by several at once, and c) it generates a solution that is provably bounded suboptimal with respect to the chosen cost function. In addition, we introduce the notions of completeness and resolution completeness for multi-object pose estimation problems, and show that D2P is resolution complete. We conduct extensive evaluations on a benchmark dataset to study various aspects of D2P in relation to existing approaches.
BibTeX
@conference{Narayanan-2016-5537,
author = {Venkatraman Narayanan and Maxim Likhachev},
title = {Discriminatively-guided Deliberative Perception for Pose Estimation of Multiple 3D Object Instances},
booktitle = {Proceedings of Robotics: Science and Systems (RSS '16)},
year = {2016},
month = {June},
}