How Do We Use Our Hands? Discovering a Diverse Set of Common Grasps
Abstract
Our aim is to show how state-of-the-art computer vision techniques can be used to advance prehensile analysis (i.e., understanding the functionality of human hands). Prehensile analysis is a broad field of multi-disciplinary interest, where researchers painstakingly analyze hours of hand-object interaction videos to understand the mechanics of hand manipulation. In this work, we present promising empirical results indicating that wearable cameras and unsupervised clustering techniques can be used to automatically discover common modes of human hand use. In particular, we use a first-person point-of-view camera to record common manipulation tasks and leverage its strengths for reliably observing human hand use. To learn a diverse set of hand-object interactions, we propose a fast online clustering algorithm based on the Determinantal Point Process (DPP). Furthermore, we develop a hierarchical extension to the DPP clustering algorithm and show that it can be used to discover appearance-based grasp taxonomies. Using a purely data-driven approach, our proposed algorithm is able to obtain hand grasp taxonomies that roughly correspond to the classic Cutkosky grasp taxonomy. We validate our approach on over 10 hours of first-person point-of-view videos in both choreographed and real-life scenarios.
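The abstract mentions a fast online DPP-based clustering algorithm but does not spell out the procedure. As a rough illustration of the core DPP idea it builds on (preferring subsets of mutually dissimilar items, since the determinant of a similarity submatrix grows with diversity), here is a minimal Python sketch that greedily selects a diverse subset under an RBF similarity kernel. The function names, the kernel choice, and the greedy MAP approximation are illustrative assumptions, not the paper's actual online or hierarchical algorithm.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Pairwise RBF similarity matrix for feature rows of X (illustrative choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def greedy_dpp_select(L, k):
    """Greedily pick k items approximately maximizing det(L_S).

    At each step, add the item that yields the largest log-det of the
    selected submatrix; this is the standard greedy MAP heuristic for
    DPPs, not the paper's online clustering method.
    """
    n = L.shape[0]
    selected = []
    for _ in range(k):
        best_item, best_logdet = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_logdet, best_item = logdet, i
        if best_item is None:
            break  # no candidate keeps the submatrix positive definite
        selected.append(best_item)
    return selected

# Toy usage: pick 5 mutually dissimilar items from random stand-in
# descriptors (hypothetical placeholders for hand appearance features).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))
L = rbf_kernel(X, gamma=0.1)
print(greedy_dpp_select(X := None or L, 5) if False else greedy_dpp_select(L, 5))
```

The design intuition matches the abstract's goal: a k-means-style objective would favor dense, redundant modes, whereas a DPP objective explicitly rewards coverage, which is why it suits discovering a *diverse* set of common grasps.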
BibTeX
@conference{Huang-2015-5984,
  author    = {De-An Huang and Minghuang Ma and Wei-Chiu Ma and Kris M. Kitani},
  title     = {How Do We Use Our Hands? Discovering a Diverse Set of Common Grasps},
  booktitle = {Proceedings of (CVPR) Computer Vision and Pattern Recognition},
  year      = {2015},
  month     = {June},
  pages     = {666--675},
}