Visual Representation and Correlation Measurement - Robotics Institute Carnegie Mellon University

VASC Seminar

Lei Wu, Postdoc, University of Pittsburgh
Monday, January 23
3:00 pm to 12:00 am
Visual Representation and Correlation Measurement

Event Location: NSH 1507
Bio: Dr. Lei Wu is currently a Postdoctoral Research Associate in the Department of Computer Science at the University of Pittsburgh. He received his Ph.D. from the Department of Electronic Engineering and Information Science, and his B.S. from the Special Class for Gifted Young (SCGY), at the University of Science and Technology of China. His research interests include distance metric learning, multimedia retrieval, and object recognition. He has published 20 international papers, 17 of them as first author, including 6 top-tier journal papers and 12 top-tier conference papers; his work has earned one best poster award and one best paper candidacy. His papers have been cited 246 times in total. He has also filed four US patents, one of which received a Microsoft patent award in 2009. Dr. Wu received the Microsoft Fellowship in 2007, which was granted to only around 20 researchers in the Asia-Pacific region that year. He also received the President Special Scholarship of the Chinese Academy of Sciences in 2010, the highest honor granted by the Academy. Each year only the top 20 Ph.D. students across all research fields receive this honor; in 2010, only 3 Ph.D. students in information science received it. His Ph.D. thesis was named an Outstanding Ph.D. Thesis by the Chinese Academy of Sciences in 2011, an award given to only the top 100 theses. Dr. Wu has served as a technical editor for the International Journal of Digital Content Technology and its Applications, as lead guest editor of a special issue of Advances in Multimedia, and as a Program Committee member for AAAI 2012, IJCAI 2011, IEEE ICIP 2011, and ACM SIGMAP 2011-2012, among others. He has also served as a technical reviewer for multiple international conferences and journals.

Abstract: The first part of the talk presents the semantic-preserving bag-of-words model, an improved Bag-of-Words (BoW) model with minimized semantic loss. One of the critical limitations of existing BoW models is the semantic gap problem: the distance between two visual features in Euclidean space does not necessarily reflect their semantic distance. This talk introduces a novel scheme for learning a codebook such that the semantic loss is minimized.
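
For readers unfamiliar with the baseline, the following is a minimal sketch of conventional BoW quantization, the step in which the semantic gap arises. It is not the semantic-preserving scheme from the talk. The codebook, the 2-D features, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def nearest_codeword(feature, codebook):
    """Return the index of the codeword closest to `feature` in Euclidean distance."""
    return min(range(len(codebook)), key=lambda i: math.dist(feature, codebook[i]))


def bow_histogram(features, codebook):
    """Quantize each feature to its nearest codeword and return a normalized histogram."""
    if not features:
        return [0.0] * len(codebook)
    counts = Counter(nearest_codeword(f, codebook) for f in features)
    return [counts.get(i, 0) / len(features) for i in range(len(codebook))]


# Hypothetical toy data: three codewords and four local features in 2-D.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
features = [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8), (3.9, 4.2)]

histogram = bow_histogram(features, codebook)
print(histogram)
```

Because each feature is assigned purely by Euclidean proximity, two features that fall on the same codeword are treated as the same visual word whether or not they depict the same kind of object, which is the limitation the abstract describes.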

The second part of the talk presents Flickr distance, a novel measure of the relationship between semantic concepts (objects, scenes) in the visual domain. Each concept is modeled as a latent topic visual language model, and the Flickr distance between concepts is defined over the divergence among these visual language models. Compared with WordNet, Flickr distance can handle far more of the concepts that exist on the Web, and it scales as concept vocabularies grow. Compared with Google distance, which is computed in the textual domain, Flickr distance is more precise for visual-domain concepts, because it captures the visual relationship between concepts rather than their co-occurrence in text search results. In addition, unlike Google distance, Flickr distance satisfies the triangle inequality, which makes it a more reasonable distance metric. Both a subjective user study and an objective evaluation show that Flickr distance is more consistent with human perception than Google distance.
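
To make the idea of a divergence-based distance concrete, here is a small sketch that compares concepts represented as discrete probability distributions. The abstract does not say which divergence is used, so the choice below is an assumption: the square root of the Jensen-Shannon divergence, which is known to satisfy the triangle inequality. The three toy distributions stand in for concept models and are hypothetical; real visual language models would be far richer.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: a symmetric, bounded smoothing of KL."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def js_distance(p, q):
    """Square root of the JS divergence (assumed stand-in for a concept distance)."""
    return math.sqrt(max(js_divergence(p, q), 0.0))


# Hypothetical distributions over three visual words for three concepts.
concept_a = [0.7, 0.2, 0.1]
concept_b = [0.6, 0.3, 0.1]
concept_c = [0.1, 0.2, 0.7]

d_ab = js_distance(concept_a, concept_b)
d_bc = js_distance(concept_b, concept_c)
d_ac = js_distance(concept_a, concept_c)

# The triangle inequality holds for this distance.
print(d_ac <= d_ab + d_bc)
```

In this toy setup, concepts with similar distributions (a and b) end up closer together than concepts with dissimilar ones (a and c), and the distance from a to c never exceeds the detour through b.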