Visual Subcategories - Robotics Institute Carnegie Mellon University
PhD Thesis Proposal

Santosh Kumar Divvala Carnegie Mellon University
Monday, April 18
3:00 pm to 12:00 am
Visual Subcategories

Event Location: NSH 1305

Abstract: This thesis introduces the concept of visual subcategories. Many image understanding tasks, such as object detection and image classification, are formulated as binary classification problems, where the positive examples are instances (bounding boxes or images) of a specific object or scene category, and the negative examples are background patches or images. Due to large intra-class variation in appearance, pose, and camera viewpoint, it is difficult to learn a single classifier that achieves good performance on a challenging test set. To relieve the classifier of this Sisyphean burden, many approaches have considered reorganizing the data into groups based on object semantics and training a separate classifier for each group. Some approaches have clustered the data using extra ground-truth annotations (e.g., viewpoint), which may not be available for many large datasets; others have used heuristics (e.g., aspect ratio), which are brittle and fail to generalize to a large number of categories. There has been little effort to understand the common insight behind their success: partitioning the data into 'visually homogeneous' clusters simplifies the learning task and results in better-performing classifiers. Based on this insight, this thesis presents an approach built on the notion of visual subcategories and empirically demonstrates their utility for object recognition and image classification tasks. Increasing the number of subcategories improves homogeneity, but at the cost of leaving very few samples per subcategory, which might be insufficient to learn a robust classifier. To cope with this problem, this thesis proposes a semi-supervised approach that leverages the vast collection of unlabeled images available on the web to populate the impoverished clusters.
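
The core idea in the abstract (split the positive examples into visually homogeneous groups, train one classifier per group, and combine their scores) can be illustrated with a small sketch. This is not the method of the thesis: the k-means clustering, the ridge-regression linear classifier, the max-over-subcategories scoring rule, the toy 2-D data, and all function names below are assumptions chosen only to keep the example short and self-contained.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (a stand-in clustering step); returns labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared distance from every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def fit_ridge(X, y, lam=1e-2):
    """Linear classifier fit by ridge regression on targets in {-1, +1}."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def train_subcategory_classifiers(pos, neg, k):
    """Cluster the positives, then train one classifier per cluster vs. all negatives."""
    labels = kmeans(pos, k)
    models = []
    for j in range(k):
        members = pos[labels == j]
        if len(members) == 0:  # skip empty clusters
            continue
        X = np.vstack([members, neg])
        y = np.concatenate([np.ones(len(members)), -np.ones(len(neg))])
        models.append(fit_ridge(X, y))
    return models

def score(models, X):
    """Score each sample by its best-matching subcategory classifier."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.max(np.stack([Xb @ w for w in models], axis=1), axis=1)

# Toy data (hypothetical): positives form two distinct "views" on either
# side of the negatives, so no single linear boundary separates them.
rng = np.random.default_rng(1)
pos = np.vstack([rng.normal([-4.0, 0.0], 0.5, size=(100, 2)),
                 rng.normal([4.0, 0.0], 0.5, size=(100, 2))])
neg = rng.normal([0.0, 0.0], 0.5, size=(100, 2))
X_all = np.vstack([pos, neg])
y_all = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])

# Baseline: one linear classifier for the whole category.
single_acc = np.mean(np.sign(score([fit_ridge(X_all, y_all)], X_all)) == y_all)

# Two subcategories: one linear classifier per cluster of positives.
models = train_subcategory_classifiers(pos, neg, k=2)
sub_acc = np.mean(np.sign(score(models, X_all)) == y_all)

print(f"single classifier accuracy: {single_acc:.2f}")
print(f"subcategory classifiers accuracy: {sub_acc:.2f}")
```

In this toy setup each subcategory classifier only has to separate one compact group of positives from the negatives, which is the simplification the abstract attributes to visually homogeneous clusters. The sketch also hints at the trade-off mentioned above: raising `k` leaves fewer positives per cluster for each classifier to learn from.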

Committee: Martial Hebert, Co-chair

Alexei A. Efros, Co-chair

Takeo Kanade

Deva Ramanan, University of California at Irvine