3:00 pm to 4:00 pm
Event Location: NSH 1507
Bio: C. Lawrence Zitnick is a senior researcher in the Interactive Visual Media group at Microsoft Research, and is an affiliate associate professor at the University of Washington. He is interested in a broad range of topics related to object recognition, the semantic interpretation of visual scenes, and methods for gathering commonsense knowledge. He developed the PhotoDNA technology used by Microsoft, Facebook, Google, and various law enforcement agencies to combat illegal imagery on the web. Before joining MSR, he received the PhD degree in robotics from Carnegie Mellon University in 2003.
Abstract: Recent significant advances in computer vision, natural language processing, and related areas have led to a renewed interest in artificial intelligence applications spanning multiple domains. In this talk, I explore the relation between computer vision, language, and commonsense reasoning through the application of image caption generation. Specifically, I describe new approaches for generating captions using recurrent neural networks, and the use of abstract scenes for gathering commonsense and semantic knowledge. I emphasize both the limitations of current approaches and the challenges that lie ahead.