9:30 am to 12:00 pm
Event Location: NSH 3305
Abstract: Object recognition is one of the fundamental challenges in computer vision, where the goal is to identify and localize the extent of object instances within an image. The current de facto standard for building high-performance object category detectors is the sliding window approach. This approach involves scanning an image with a fixed-size rectangular window and applying a classifier to the features extracted within the sub-image defined by the window. In this thesis, we study two important factors influencing the performance of this approach. First is the role played by context, where information outside the sliding window is used to rescore the detections output by the local window classifier. Context helps suppress detections in regions that are less likely to contain an object and promotes detections in regions that are more plausible. In the first part of this thesis, we enumerate different sources and uses of context, and comprehensively evaluate their role in a benchmark detection challenge. Our analysis demonstrates that carefully used contextual cues can not only make a good local window classifier perform even better, but also shift the typical error patterns of the local classifier toward more meaningful and reasonable errors. Our analysis also provides a basis for assessing the inherent limitations of existing approaches and the specific problems that remain unsolved. Second is the role played by subcategories, where information within the sliding window is used to split the training data into smaller groups, for learning multiple classifiers to model the appearance of an object. The smaller groups have reduced appearance diversity and thus lead to simpler classification problems. In the second part of this thesis, we analyze different schemes to generate subcategories, and find that unsupervised feature-space clustering produces well-performing subcategory classifiers.
Beyond performance gains, subcategories are attractive for their conceptual simplicity and computational friendliness. For example, we find that careful use of subcategories can potentially replace the need for deformable parts within the state-of-the-art deformable parts model detector for many object categories. Data fragmentation is an important problem associated with subcategory-based methods. We present a novel approach that circumvents this problem by allowing different subcategories to share each other’s training instances.
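For readers unfamiliar with the sliding window approach described in the abstract, the following is a minimal illustrative sketch, not the detector studied in the thesis. The function names, the single-scale scan, the stride, and the mean-intensity scoring function are all assumptions made for illustration; a real detector would apply a learned classifier to features extracted from each window and would typically scan over multiple scales.

```python
import numpy as np


def sliding_window_detect(image, window_hw, stride, score_fn, threshold):
    """Scan `image` with a fixed-size window; keep windows scoring above `threshold`.

    Returns a list of (top, left, height, width, score) tuples.
    """
    win_h, win_w = window_hw
    img_h, img_w = image.shape[:2]
    detections = []
    for top in range(0, img_h - win_h + 1, stride):
        for left in range(0, img_w - win_w + 1, stride):
            # Sub-image defined by the current window position.
            patch = image[top:top + win_h, left:left + win_w]
            score = float(score_fn(patch))
            if score > threshold:
                detections.append((top, left, win_h, win_w, score))
    return detections


def mean_intensity_score(patch):
    # Hypothetical stand-in for a trained window classifier:
    # it simply scores a patch by its mean pixel value.
    return patch.mean()


# Toy 8x8 image containing one bright 4x4 "object".
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0

detections = sliding_window_detect(
    image, window_hw=(4, 4), stride=2,
    score_fn=mean_intensity_score, threshold=0.9,
)
print(detections)
```

In this toy setting only the window that exactly covers the bright square passes the threshold. Contextual rescoring and subcategory classifiers, as discussed in the abstract, would act on the scores produced by a loop of this kind.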
Committee: Martial Hebert, Co-chair
Alexei A. Efros, Co-chair
Takeo Kanade
Deva Ramanan, University of California at Irvine