Video Compression for Recognition & Video Recognition for Compression - Robotics Institute Carnegie Mellon University

VASC Seminar

Philipp Krähenbühl, Assistant Professor, Computer Science Department, University of Texas at Austin
Monday, November 5
2:30 pm to 3:30 pm
GHC 6501
Video Compression for Recognition & Video Recognition for Compression

Abstract: Training robust deep video representations has proven to be much more challenging than learning deep image representations. One reason is that videos are huge and highly redundant: the ‘true’, interesting signal often drowns in irrelevant data. In the first part of the talk, I will show how to train a deep network directly on compressed video (e.g., H.264 or HEVC), which has already had most of this redundancy removed, rather than on the traditional, highly redundant RGB stream. We show that this is not only faster but also more accurate than traditional methods.
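To make the compressed-domain idea concrete, here is a minimal, illustrative PyTorch sketch, not the speaker's actual architecture: a toy two-stream network that consumes the motion vectors and residuals carried by P-frames instead of decoded RGB. The class name CompressedVideoNet, the layer sizes, and the input shapes are all assumptions for illustration.

```python
import torch
import torch.nn as nn

class CompressedVideoNet(nn.Module):
    """Toy two-stream model: one branch reads motion vectors (2 channels),
    the other reads residuals (3 channels); branch logits are averaged."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.mv_branch = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, num_classes))
        self.res_branch = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, num_classes))

    def forward(self, motion_vectors, residuals):
        # A full system would also process I-frames with an image CNN.
        return 0.5 * (self.mv_branch(motion_vectors) + self.res_branch(residuals))

# Dummy P-frame inputs: motion vectors (B, 2, H, W) and residuals (B, 3, H, W).
mv = torch.randn(4, 2, 56, 56)
res = torch.randn(4, 3, 56, 56)
print(CompressedVideoNet()(mv, res).shape)  # torch.Size([4, 10])
```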
In the second part of the talk, we will flip things around. I will present an end-to-end deep learning video codec. Our codec builds on one simple idea: video compression is repeated image interpolation. It thus benefits from recent advances in deep image interpolation and generation. Our deep video codec outperforms traditional codecs such as H.261 and MPEG-4 Part 2, and performs on par with H.264 and H.265.
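As a rough illustration of “compression as repeated image interpolation”, the NumPy sketch below keeps every gop-th frame as a key frame, predicts the frames in between by linear interpolation, and stores only coarsely quantized residuals. The functions compress, decompress, and predict are hypothetical stand-ins; a learned codec would replace both the interpolation and the residual coder with neural networks.

```python
import numpy as np

def predict(keys, i, gop):
    """Linear interpolation between the two surrounding key frames."""
    k0 = keys[i // gop]
    k1 = keys[min(i // gop + 1, len(keys) - 1)]
    t = (i % gop) / gop
    return (1 - t) * k0 + t * k1

def compress(frames, gop=4, q=16):
    """Keep every gop-th frame as a key frame; for every frame store only the
    coarsely quantized residual against its interpolated prediction."""
    keys = frames[::gop]
    residuals = [np.round((frames[i] - predict(keys, i, gop)) / q)
                 for i in range(len(frames))]
    return keys, residuals

def decompress(keys, residuals, gop=4, q=16):
    return [predict(keys, i, gop) + r * q for i, r in enumerate(residuals)]

# A slowly brightening 8x8 clip; reconstruction error is bounded by the quantizer step.
frames = [np.full((8, 8), v) for v in np.linspace(0.0, 255.0, 9)]
keys, residuals = compress(frames)
recon = decompress(keys, residuals)
print(max(np.abs(f - r).max() for f, r in zip(frames, recon)))  # <= q / 2
```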

Bio: Philipp is an Assistant Professor in the Department of Computer Science at the University of Texas at Austin. He received his PhD in 2014 from the CS Department at Stanford University and then spent two wonderful years as a postdoc at UC Berkeley. His research interests lie in computer vision, machine learning, and computer graphics. He is particularly interested in deep learning, as well as image segmentation and understanding.

Homepage: http://www.philkr.net/