Resource-constrained learning and inference for visual perception - Robotics Institute Carnegie Mellon University

PhD Speaking Qualifier

Mengtian Li, Robotics Institute, Carnegie Mellon University
Thursday, March 26
9:00 am to 10:00 am
Resource-constrained learning and inference for visual perception


Abstract
Real-world applications usually require computer vision algorithms to meet certain resource constraints. In this talk, I will present evaluation methods and principled solutions for resource-constrained settings in both training and testing.

First, I will talk about a formal setting for studying training under the non-asymptotic, resource-constrained regime, i.e., budgeted training. We analyze the following problem: “given a dataset, algorithm, and fixed resource budget, what is the best achievable performance?” We focus on the number of optimization iterations as the representative resource. Under such a setting, we show that it is critical to adjust the learning rate schedule according to the given budget.
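One simple way to make a learning rate schedule budget-aware, consistent with the idea above, is to decay the rate linearly so that it reaches zero exactly when the iteration budget is exhausted. The function below is a hypothetical illustration (the name, base rate, and linear form are assumptions, not necessarily the schedule used in the work):

```python
def budgeted_lr(t: int, budget: int, base_lr: float = 0.1) -> float:
    """Learning rate at iteration t under a fixed iteration budget.

    A linearly decaying schedule that reaches zero exactly at the end
    of the budget, so the decay adapts to the budget rather than
    following a fixed, budget-agnostic curve (illustrative sketch).
    """
    if not 0 <= t < budget:
        raise ValueError("iteration must lie within the budget")
    return base_lr * (1.0 - t / budget)
```

For example, halving the budget doubles how fast the rate decays, whereas a budget-agnostic step schedule would leave the tail of a shortened run at a learning rate that is too high.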

Second, I will talk about how vision algorithms should respond to resource constraints inherent in embodied perception, where an autonomous agent needs to perceive its environment and (re)act in time. Our key observation is that by the time an algorithm finishes processing a particular image frame, the surrounding world has changed. To help explore vision in this streaming context, we introduce a meta-benchmark that systematically converts any image understanding task into a streaming image understanding task. Our proposed solutions and their empirical analysis demonstrate a number of surprising conclusions: (1) the tradeoff between accuracy and latency can now be measured quantitatively, and there exists an optimal “sweet spot” that maximizes streaming accuracy, (2) asynchronous tracking and future forecasting naturally emerge as internal representations that enable streaming image understanding, and (3) dynamic scheduling can be used to overcome temporal aliasing, yielding the paradoxical result that latency is sometimes minimized by sitting idle and “doing nothing”.
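The core of a streaming evaluation like the one described above is that ground truth at time t is compared against whatever output the algorithm had already finished by t, not against a per-frame output. A minimal sketch of that pairing step (the function name and the two-pointer formulation are my own; they are not the benchmark's actual API):

```python
def match_predictions_to_queries(pred_times, query_times):
    """For each query (ground-truth) time, return the index of the
    latest prediction finished at or before that time, or None if no
    prediction was available yet. Both input lists are assumed sorted.

    This captures the streaming principle: slow algorithms are paired
    with stale outputs, so latency directly hurts measured accuracy.
    """
    matches = []
    j = -1  # index of the latest prediction finished so far
    for q in query_times:
        while j + 1 < len(pred_times) and pred_times[j + 1] <= q:
            j += 1
        matches.append(j if j >= 0 else None)
    return matches
```

For example, a model that finishes predictions at times [0.1, 0.5, 0.9] and is queried at [0.0, 0.3, 0.6, 1.0] gets no credit at t=0.0 and is evaluated on its t=0.1 output at t=0.3, making the accuracy/latency tradeoff directly measurable.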

Committee
Deva Ramanan
David Held
Kris Kitani
Aayush Bansal