
Bottom-Up and Top-Down Reasoning with Hierarchical Rectified Gaussians

P. Hu and D. Ramanan
Conference Paper, Proceedings of (CVPR) Computer Vision and Pattern Recognition, pp. 5600–5609, June 2016

Abstract

Convolutional neural nets (CNNs) have demonstrated remarkable performance in recent years. Such approaches tend to work in a "unidirectional" bottom-up feed-forward fashion. However, practical experience and biological evidence tell us that feedback plays a crucial role, particularly for detailed spatial understanding tasks. This work explores "bidirectional" architectures that also reason with top-down feedback: neural units are influenced by both lower and higher-level units. We do so by treating units as rectified latent variables in a quadratic energy function, which can be seen as a hierarchical Rectified Gaussian (RG) model [39]. We show that inference in RGs can be formulated as a quadratic program (QP), which can in turn be optimized with a recurrent neural network (with rectified linear units). This allows RGs to be trained with GPU-optimized gradient descent. From a theoretical perspective, RGs help establish a connection between CNNs and hierarchical probabilistic models. From a practical perspective, RGs are well suited for detailed spatial tasks that can benefit from top-down reasoning. We illustrate them on the challenging task of keypoint localization under occlusions, where local bottom-up evidence may be misleading. We demonstrate state-of-the-art results on challenging benchmarks.
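
For intuition, below is a minimal NumPy sketch (illustrative, not code from the paper) of the unrolling idea described in the abstract: each hidden layer is repeatedly re-estimated from its bottom-up input and its top-down feedback and then rectified, so a fixed number of update steps forms a recurrent network of rectified linear units. The layer sizes, dense weight matrices, and exact update rule are assumptions made for this sketch; the paper derives its updates from a QP over convolutional features.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def unrolled_rg_inference(x, W, b, n_steps=5):
    # Hypothetical unrolled inference for a chain of rectified latent
    # layers z[0..L-1]. W[i] maps layer i-1 (or the input x) up to
    # layer i; W[i].T carries top-down feedback back down.
    L = len(W)
    # One bottom-up (feed-forward) pass to initialize the layers.
    z, h = [], x
    for i in range(L):
        h = relu(W[i] @ h + b[i])
        z.append(h)
    # Unrolled refinement: each layer is re-estimated from the layer
    # below and the layer above, then rectified, so every step is a
    # recurrent update with rectified linear units.
    for _ in range(n_steps):
        for i in range(L):
            bottom_up = W[i] @ (x if i == 0 else z[i - 1])
            top_down = W[i + 1].T @ z[i + 1] if i + 1 < L else 0.0
            z[i] = relu(bottom_up + top_down + b[i])
    return z

# Toy usage with random weights (sizes are arbitrary).
rng = np.random.default_rng(0)
sizes = [8, 16, 12, 6]
W = [0.1 * rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
b = [np.zeros(m) for m in sizes[1:]]
z = unrolled_rg_inference(rng.standard_normal(sizes[0]), W, b)

Note that this dense-chain version only conveys the bidirectional structure; the paper's actual updates use convolutional weights tied across the bottom-up and top-down passes.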

BibTeX

@conference{Hu-2016-121180,
author = {P. Hu and D. Ramanan},
title = {Bottom-Up and Top-Down Reasoning with Hierarchical Rectified Gaussians},
booktitle = {Proceedings of (CVPR) Computer Vision and Pattern Recognition},
year = {2016},
month = {June},
pages = {5600--5609},
}