Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines - Robotics Institute Carnegie Mellon University

VASC Seminar

Gunhee Kim, PhD Candidate, CMU
Wednesday, June 19
4:00 pm to 4:30 pm
Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines

Event Location: NSH 1507
Bio: Gunhee Kim is a PhD candidate advised by Eric P. Xing in the Computer Science Department at Carnegie Mellon University. Before starting his PhD in 2009, he earned a master’s degree under the supervision of Martial Hebert in the Robotics Institute, CMU. He has also been a visiting student in Antonio Torralba’s group at CSAIL, MIT, and in Fei-Fei Li’s group at Stanford University. His research focuses on computer vision and web mining problems that emerge from big image data shared online, which he addresses by developing scalable and effective machine learning and optimization techniques.

Abstract: With the explosion in popularity of online photo sharing, we can easily collect a huge number of photo streams for any topic of interest, such as scuba diving as an outdoor recreational activity class. The retrieved photo streams are neither aligned nor calibrated, since they are taken from different temporal, spatial, and personal perspectives. At the same time, however, they are likely to share common storylines that consist of sequences of events and activities that recur frequently within the topic. In this paper, as a first technical step toward detecting such collective storylines, we propose an approach to jointly align and segment multiple uncalibrated photo streams. The alignment task discovers matched images between different photo streams, and the image segmentation task parses each image into multiple meaningful regions to facilitate image understanding. We close the loop between the two tasks so that solving one helps improve the performance of the other in a mutually rewarding way. To this end, we design a scalable optimization framework based on message passing that achieves both tasks jointly for the whole input image set at once. Evaluating on a new Flickr dataset of 15 outdoor activities, consisting of 1.5 million images from 13 thousand photo streams, our empirical results show that the proposed algorithms are more successful than other candidate methods for both tasks.