Efficient Multi-View Object Recognition and Full Pose Estimation

Alvaro Collet Romea and Siddhartha Srinivasa

Conference Paper, Proceedings of (ICRA) International Conference on Robotics and Automation, pp. 2050 - 2055, May, 2010

Abstract

We present an approach for efficiently recognizing all objects in a scene and estimating their full pose from multiple views. Our approach builds upon a state-of-the-art single-view algorithm which recognizes and registers learned metric 3D models using local descriptors. We extend to multiple views using a novel multi-step optimization that processes each view individually and feeds consistent hypotheses back to the algorithm for global refinement. We demonstrate that our method produces results comparable to the theoretical optimum, a full multi-view generalized camera approach, while avoiding its combinatorial time complexity. We provide experimental results demonstrating pose accuracy, speed, and robustness to model error using a three-camera rig, as well as a physical implementation of the pose output being used by an autonomous robot executing grasps in highly cluttered scenes.

Notes
Please see the accompanying video at http://www.youtube.com/watch?v=ZNHRH00UMvk

BibTeX

@conference{Romea-2010-10439,
author = {Alvaro Collet Romea and Siddhartha Srinivasa},
title = {Efficient Multi-View Object Recognition and Full Pose Estimation},
booktitle = {Proceedings of (ICRA) International Conference on Robotics and Automation},
year = {2010},
month = {May},
pages = {2050--2055},
}
