Exploring the Spatial Hierarchy of Mixture Models for Human Pose Estimation
Abstract
Human pose estimation requires a versatile yet well-constrained spatial model for grouping locally ambiguous parts together to produce a globally consistent hypothesis. Previous works either use local deformable models deviating from a certain template, or use a global mixture representation in the pose space. In this paper, we propose a new hierarchical spatial model that can capture an exponential number of poses with a compact mixture representation on each part. Using latent nodes, it can represent high-order spatial relationships among parts with exact inference. Unlike recent hierarchical models that associate each latent node with a mixture of appearance templates (like HoG), we use the hierarchical structure as a pure spatial prior, avoiding the large and often confounding appearance space. We verify the effectiveness of this model in three ways. First, samples representing human-like poses can be drawn from our model, showing its ability to capture high-order dependencies of parts. Second, our model achieves accurate reconstruction of unseen poses compared to a nearest-neighbor pose representation. Finally, our model achieves state-of-the-art performance on three challenging datasets, and substantially outperforms recent hierarchical models.
BibTeX
@conference{Tian-2012-120323,
author = {Y. Tian and L. Zitnick and S. G. Narasimhan},
title = {Exploring the Spatial Hierarchy of Mixture Models for Human Pose Estimation},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2012},
month = {October},
pages = {256--269},
}