3:00 pm to 4:00 pm
Event Location: NSH 1507
Bio: I have been a PhD student at the Language Technologies Institute, Carnegie Mellon University, since fall 2012. I am working with Prof. Abhinav Gupta on joint learning with language and vision and on life-long learning. I am also working with Prof. Tom Mitchell at CMU. I recently finished an internship at MSR with Prof. C. Lawrence Zitnick. Previously, I graduated with a bachelor’s degree in computer science from Zhejiang University, China. During my undergraduate studies, I worked mainly under the supervision of Prof. Deng Cai in the State Key Laboratory of CAD & CG. I was a summer intern at UCLA in 2011, working mainly with Prof. Jenn Wortman Vaughan.
Abstract: We present an approach that uses large amounts of web data to train ConvNets. Specifically, inspired by curriculum learning, we present a two-step approach to CNN training. We demonstrate that the visual features extracted from our network compare favorably with those from networks pretrained on ImageNet. We also show that the webly supervised network can be used to localize objects in noisy web images, achieving the best performance on the VOC 2007 object detection challenge among methods that use no VOC training data.