3:00 pm to 4:00 pm
Event Location: NSH 1507
Bio: I’m a sixth-year PhD student in the Machine Learning Department at CMU, working with Alyosha Efros and Abhinav Gupta. I graduated from CMU in 2010 with a B.S. in computer science and cognitive science, with a minor in neural computation, completing an undergraduate thesis with Tai Sing Lee. I’m interested in computer vision and all the learning problems associated with it. In particular, I’m interested in weak label learning. In computer vision, the standard labels we use (e.g. bounding boxes, keypoint annotations) not only tend to be expensive to collect, but also tend to be a poor approximation of what we actually know about images. Yet some types of labels come cheaply: for example, GPS tags, web text, and even raw image context. My work aims to show that these cues can provide roughly the same information as manually collected labels, allowing us to learn representations that are driven by the data rather than by annotators. Thanks to Google for a Fellowship supporting my research.
Abstract: This work explores the use of spatial context as a source of free and plentiful supervisory signal for training a rich visual representation. Given only a large, unlabeled image collection, we extract random pairs of patches from each image and train a convolutional neural net to predict the position of the second patch relative to the first. We demonstrate that the learned representation is useful for unsupervised object discovery as well as learning from datasets where labels are scarce, including providing a significant boost over a randomly initialized ConvNet on Pascal VOC 2007.
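To make the pretext task concrete, here is a minimal sketch in PyTorch of the context-prediction setup the abstract describes: sample a patch and one of its eight neighbors, then classify the neighbor's relative position. This is not the speaker's code; the patch size, gap, encoder architecture, and names (`PATCH`, `GAP`, `sample_patch_pair`, `ContextPredictionNet`) are all illustrative assumptions.

```python
# Sketch of the context-prediction pretext task (illustrative, not the
# authors' implementation; sizes below are assumed, not from the talk).
import random
import torch
import torch.nn as nn

PATCH = 96   # assumed patch side length, in pixels
GAP = 48     # assumed gap between patches, to discourage trivial boundary cues

# The 8 possible positions of the second patch relative to the first,
# as (row, col) offsets on a 3x3 grid around the center patch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           (0, -1),           (0, 1),
           (1, -1),  (1, 0),  (1, 1)]

def sample_patch_pair(image):
    """Sample a center patch and one random neighbor from a CHW image tensor.

    Returns (center, neighbor, label), where label in 0..7 indexes the
    neighbor's relative position -- the free supervisory signal.
    Assumes the image is at least 2 * (PATCH + GAP) + PATCH pixels per side.
    """
    _, h, w = image.shape
    step = PATCH + GAP
    # Pick a center location that leaves room for all 8 neighbors.
    cy = random.randint(step, h - step - PATCH)
    cx = random.randint(step, w - step - PATCH)
    label = random.randrange(8)
    dy, dx = OFFSETS[label]
    ny, nx = cy + dy * step, cx + dx * step
    center = image[:, cy:cy + PATCH, cx:cx + PATCH]
    neighbor = image[:, ny:ny + PATCH, nx:nx + PATCH]
    return center, neighbor, label

class ContextPredictionNet(nn.Module):
    """Siamese encoder plus an 8-way classifier over paired embeddings."""
    def __init__(self, embed_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(  # small stand-in for the talk's ConvNet
            nn.Conv2d(3, 64, 5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, embed_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(2 * embed_dim, 8)  # 8 relative positions

    def forward(self, center, neighbor):
        # Shared weights: both patches pass through the same encoder.
        z = torch.cat([self.encoder(center), self.encoder(neighbor)], dim=1)
        return self.classifier(z)
```

Training would minimize a standard 8-way cross-entropy loss on these labels; afterwards the classifier head is discarded and the shared encoder is reused for the downstream tasks mentioned in the abstract, such as fine-tuning on Pascal VOC.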