3:30 pm to 4:30 pm
3305 Newell-Simon Hall
Abstract: Deep learning has revolutionized all aspects of computer vision, but its successes have come from supervised learning at scale: large models trained on ever-larger labeled datasets. However, this reliance on labels makes these systems fragile in new scenarios or on new tasks where labels are unavailable. This is in stark contrast to humans, whose visual systems seem to develop with few, if any, labels, and who have little trouble adapting to new situations.
In this talk, I will argue that the reliance on large labeled datasets arises because we treat each perceptual task as an isolated, traditional machine learning problem, rather than as a different facet of a unified perceptual understanding problem: that of modeling, recognizing, and predicting a single underlying physical world. I will show that this perspective allows us to perform seemingly impossible feats: building a recognition system with no labels at all, and drawing correspondences between disparate objects without a single correspondence label. I will end with some work on how we can model the physical world better, in the hope of yielding more fine-grained, robust, and general computer vision systems.
Bio: Bharath Hariharan is an assistant professor at Cornell University. He works on problems in computer vision and machine learning that defy the big-data label. He received his PhD from the University of California, Berkeley, advised by Jitendra Malik. His work has been recognized with an NSF CAREER award and a PAMI Young Researcher Award.
Homepage: https://www.cs.cornell.edu/~bharathh/
Sponsored in part by: Meta Reality Labs Pittsburgh