PhD Thesis Proposal

Abhinav Shrivastava, Carnegie Mellon University
Friday, November 18
1:00 pm to 12:00 am
Discovering and Leveraging Visual Structure for Large-scale Recognition

Event Location: GHC 4405

Abstract: Visual recognition has seen tremendous advances in the last decade. This progress is primarily due to learning algorithms trained with two key ingredients: large amounts of data and extensive supervision. While acquiring visual data is cheap, getting it labeled is far more expensive. So how do we enable learning algorithms to harness the sea of freely available visual data without worrying about costly supervision?

Interestingly, our visual world is extraordinarily varied and complex, but despite its richness, the space of visual data may not be that astronomically large. We live in a well-structured, predictable world, where cars almost always drive on roads, the sky is always above the ground, and so on; and these regularities can provide the missing ingredients required to scale up our visual learning algorithms. This thesis aims to develop algorithms that: 1) discover this implicit and explicit structure in visual data, and 2) leverage the regularities to provide necessary constraints that facilitate large-scale visual learning. In particular, we propose a two-pronged strategy to enable large-scale recognition.

In Part I, we present algorithms for training better and more reliable supervised recognition models that exploit structure in various flavors of labeled data and target tasks. In Part II, we leverage these visual models and large amounts of unlabeled data to discover constraints, and use these constraints in a semi-supervised learning framework to improve visual recognition.

Committee: Abhinav Gupta, Chair

Alexei A. Efros, University of California, Berkeley

Martial Hebert

Deva Ramanan

Jitendra Malik, University of California, Berkeley