Single-Network Whole-Body Pose Estimation

Conference Paper, Proceedings of the International Conference on Computer Vision (ICCV), pp. 6981–6990, October 2019

Abstract

We present the first single-network approach for 2D whole-body pose estimation, which entails the simultaneous localization of body, face, hand, and foot keypoints. Due to its bottom-up formulation, our method maintains constant real-time performance regardless of the number of people in the image. The network is trained in a single stage using multi-task learning, through an improved architecture that can handle the scale differences between body/foot and face/hand keypoints. Our approach considerably improves upon OpenPose, the only prior work capable of whole-body pose estimation, in both speed and global accuracy. Unlike OpenPose, our method does not need to run an additional network for each hand and face candidate, making it substantially faster for multi-person scenarios. This work directly reduces the computational complexity of applications that require 2D whole-body information (e.g., VR/AR, re-targeting). In addition, it yields higher accuracy, especially for occluded, blurry, and low-resolution faces and hands. For code, trained models, and validation benchmarks, visit our project page: https://github.com/CMU-Perceptual-Computing-Lab/openpose_train.
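To make the multi-task formulation concrete, the sketch below shows one plausible way a shared network's keypoint-heatmap losses for body/foot, face, and hand branches could be combined in a single training stage. This is an illustrative assumption, not the authors' implementation: the class name MultiTaskHeatmapLoss, the channel counts, and the per-group weights are all hypothetical. The per-sample masks reflect the paper's setting, where no single dataset annotates every part group (e.g., body datasets lack face/hand labels).

import torch
import torch.nn as nn

# Hypothetical heatmap channel counts per part group (2 hands x 21 keypoints).
PART_CHANNELS = {"body_foot": 25, "face": 70, "hand": 42}

class MultiTaskHeatmapLoss(nn.Module):
    """L2 heatmap loss summed over part groups, masked per sample.

    masks[part][i] is 1.0 if sample i carries annotations for that part
    group and 0.0 otherwise, so unlabeled parts contribute no gradient.
    """

    def __init__(self, weights=None):
        super().__init__()
        self.weights = weights or {k: 1.0 for k in PART_CHANNELS}

    def forward(self, preds, targets, masks):
        # preds/targets: dict part -> (B, C, H, W) heatmap tensors
        # masks: dict part -> (B,) annotation-availability indicators
        total = 0.0
        for part, pred in preds.items():
            per_sample = ((pred - targets[part]) ** 2).mean(dim=(1, 2, 3))
            total = total + self.weights[part] * (per_sample * masks[part]).mean()
        return total

Because all part groups share one backbone and are supervised jointly, a single forward pass suffices at inference time, which is consistent with the constant runtime claimed above.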

BibTeX

@conference{Hidalgo-2019-122165,
author = {Gines Hidalgo Martinez and Yaadhav Raaj and Haroon Idrees and Donglai Xiang and Hanbyul Joo and Tomas Simon and Yaser Sheikh},
title = {Single-Network Whole-Body Pose Estimation},
booktitle = {Proceedings of the International Conference on Computer Vision (ICCV)},
year = {2019},
month = {October},
pages = {6981--6990},
}