Understanding Everyday Hands in Action from RGB-D Images

G. Rogez, J. Supancic, and D. Ramanan
Conference Paper, Proceedings of the International Conference on Computer Vision (ICCV), pp. 3889–3897, December 2015

Abstract

We analyze functional manipulations of handheld objects, formalizing the problem as one of fine-grained grasp classification. To do so, we make use of a recently developed fine-grained taxonomy of human-object grasps. We introduce a large dataset of 12,000 RGB-D images covering 71 everyday grasps in natural interactions. Our dataset differs from past work (which typically approaches the problem from a robotics perspective) in its scale, diversity, and combination of RGB and depth data. From a computer-vision perspective, our dataset allows for exploration of contact and force prediction (crucial concepts in functional grasp analysis) from perceptual cues. We present extensive experimental results with state-of-the-art baselines, illustrating the role of segmentation, object context, and 3D understanding in functional grasp analysis. We demonstrate nearly a 2X improvement over prior work and a naive deep baseline, while pointing out important directions for improvement.
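To make the task concrete, the sketch below shows one plausible form a naive deep baseline could take: a small PyTorch CNN that maps a 4-channel RGB-D crop (depth stacked as a fourth channel) to one of the 71 grasp classes. The network name, architecture, and input size are illustrative assumptions, not the authors' actual model.

# Hypothetical minimal baseline for 71-way grasp classification from RGB-D.
# This is an illustrative sketch, NOT the model evaluated in the paper.
import torch
import torch.nn as nn

NUM_GRASPS = 71  # grasp classes in the dataset's taxonomy

class RGBDGraspNet(nn.Module):  # hypothetical name
    def __init__(self, num_classes: int = NUM_GRASPS):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=3, padding=1),  # 4 channels: RGB + depth
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # global pooling makes the net input-size agnostic
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.features(x).flatten(1)  # (batch, 64)
        return self.classifier(f)        # (batch, 71) class logits

# Usage: a batch of two 128x128 RGB-D crops.
model = RGBDGraspNet()
logits = model(torch.randn(2, 4, 128, 128))
print(logits.shape)  # torch.Size([2, 71])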

BibTeX

@inproceedings{Rogez-2015-121185,
  author    = {G. Rogez and J. Supancic and D. Ramanan},
  title     = {Understanding Everyday Hands in Action from RGB-D Images},
  booktitle = {Proceedings of the International Conference on Computer Vision (ICCV)},
  year      = {2015},
  month     = {December},
  pages     = {3889--3897},
}