3:00 pm to 4:00 pm
Event Location: NSH 1507
Bio: Xinlei Chen is a PhD student in the Language Technologies Institute at Carnegie Mellon University, where he is supervised by Abhinav Gupta. He holds a Bachelor’s degree in Computer Science from Zhejiang University, China. His research focuses on the intersection of computer vision and natural language processing, and he is particularly interested in data-driven algorithms for lifelong learning.
Abstract: Recent success in representation learning has triggered an explosion of papers that explore the bi-directional mapping between images and their sentence-based descriptions. Recurrent neural networks (RNNs) are one popular way to stitch together words for sentence generation and to perform bi-directional image-text retrieval. My internship work follows this trend but, unlike standard RNNs, our model uses a novel recurrent visual memory that automatically learns to remember long-term visual concepts to aid in both sentence generation and visual feature reconstruction. This simple model achieves impressive results on many datasets, even when compared to human performance.