PhD Thesis Proposal
November
1:00 pm to 12:00 am
Event Location: GHC 4405
Abstract: Visual recognition has seen tremendous advances in the last decade. This progress is primarily due to learning algorithms trained with two key ingredients: large amounts of data and extensive supervision. While acquiring visual data is cheap, getting it labeled is far more expensive. So how do we enable learning algorithms to harness the sea of freely available visual data without relying on costly supervision?
Interestingly, our visual world is extraordinarily varied and complex, but despite its richness, the space of visual data may not be astronomically large. We live in a well-structured, predictable world, where cars almost always drive on roads, the sky is always above the ground, and so on; these regularities can provide the missing ingredients required to scale up our visual learning algorithms. This thesis aims to develop algorithms that: 1) discover this implicit and explicit structure in visual data, and 2) leverage the regularities to provide necessary constraints that facilitate large-scale visual learning. In particular, we propose a two-pronged strategy to enable large-scale recognition.
In Part I, we present algorithms for training better and more reliable supervised recognition models that exploit structure in various flavors of labeled data and target tasks. In Part II, we leverage these visual models and large amounts of unlabeled data to discover constraints, and use these constraints in a semi-supervised learning framework to improve visual recognition.
Committee: Abhinav Gupta, Chair
Alexei A. Efros, University of California, Berkeley
Martial Hebert
Deva Ramanan
Jitendra Malik, University of California, Berkeley