Visual Object Detection with Deformable Part Models
Abstract
We describe a state-of-the-art system for finding objects in cluttered images. Our system is based on deformable models that represent objects using local part templates and geometric constraints on the locations of parts. We reduce object detection to classification with latent variables. The latent variables introduce invariances that make it possible to detect objects with highly variable appearance. We use a generalization of support vector machines to incorporate latent information during training. This has led to a general framework for discriminative training of classifiers with latent variables. Discriminative training benefits from large training datasets. In practice we use an iterative algorithm that alternates between estimating latent values for positive examples and solving a large convex optimization problem. Practical optimization of this large convex problem can be done using active set techniques for adaptive subsampling of the training data.
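The alternating procedure mentioned in the abstract can be illustrated with a small toy sketch. Everything here is an assumption made for illustration, not the authors' implementation: each example is a "bag" of candidate feature vectors (one per latent value), the score is the maximum over candidates, the convex step is solved with plain subgradient descent rather than the active set technique the abstract mentions, and the names `best_latent`, `score`, and `train_latent_svm`, the zero initialization, and the toy data are hypothetical.

```python
# Toy sketch of latent-variable SVM training (illustrative only).
# An example is a list of candidate feature vectors, one per latent value z.
# score(w, x) = max over z of w . phi(x, z)

def dot(w, v):
    return sum(wi * vi for wi, vi in zip(w, v))

def best_latent(w, candidates):
    """Return (index, score) of the highest-scoring latent choice."""
    scores = [dot(w, phi) for phi in candidates]
    z = max(range(len(scores)), key=scores.__getitem__)
    return z, scores[z]

def score(w, candidates):
    return best_latent(w, candidates)[1]

def train_latent_svm(positives, negatives, dim, C=1.0,
                     outer_iters=5, inner_steps=200, lr=0.01):
    w = [0.0] * dim  # assumption: zero start; initialization can matter
    for _ in range(outer_iters):
        # Step 1: fix the latent value of each positive under the current w.
        fixed_pos = [bag[best_latent(w, bag)[0]] for bag in positives]
        # Step 2: with positives fixed, the objective
        #   0.5*||w||^2 + C * sum of hinge losses
        # is convex in w; minimize it by subgradient descent.
        for _ in range(inner_steps):
            grad = list(w)  # gradient of the regularizer 0.5*||w||^2
            for phi in fixed_pos:
                if dot(w, phi) < 1.0:  # positive inside the margin
                    grad = [g - C * p for g, p in zip(grad, phi)]
            for bag in negatives:
                z, s = best_latent(w, bag)  # negatives keep the max over z
                if s > -1.0:  # negative inside the margin
                    grad = [g + C * p for g, p in zip(grad, bag[z])]
            w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

# Hypothetical toy data: features are (object cue, clutter cue, bias).
positives = [
    [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0]],  # clutter first, object second
    [[1.0, 0.2, 1.0], [0.0, 1.0, 1.0]],
]
negatives = [
    [[0.0, 1.0, 1.0], [0.0, 0.5, 1.0]],  # clutter only
    [[0.0, 0.8, 1.0], [0.0, 0.0, 1.0]],
]

w = train_latent_svm(positives, negatives, dim=3)
for bag in positives + negatives:
    print(best_latent(w, bag))
```

In this sketch the latent value of a positive example is re-estimated once per outer iteration, while negatives always contribute through their highest-scoring candidate, which is what keeps the inner problem convex.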
BibTeX
@article{Felzenszwalb-2013-121107,
author = {Pedro Felzenszwalb and Ross Girshick and David McAllester and Deva Ramanan},
title = {Visual Object Detection with Deformable Part Models},
journal = {Communications of the ACM},
year = {2013},
month = {September},
volume = {56},
number = {9},
pages = {97--105},
}