Analysis by Synthesis for Modern Computer Vision - Robotics Institute Carnegie Mellon University

PhD Thesis, Tech. Report CMU-RI-TR-24-55, August 2024

Abstract

Image denoising, depth completion, scene flow, and dynamic 3D reconstruction are all examples of recovery problems: the estimation of multidimensional signals from corrupted or partial measurements. This thesis examines these problems from the classic analysis-by-synthesis perspective, where a signal model is used to propose hypotheses, which are then compared to observations. This paradigm has fallen out of favor with the rise of feed-forward neural networks, but we claim that analysis by synthesis still has much to offer. Specifically, we argue it gives us a general framework for combining modern learning-based approaches with knowledge of forward models and intuitive priors.

First, we will discuss the typical feed-forward setting where one has a dataset of paired measurements and clean signals. In this setting, we show how embedding an analysis-by-synthesis optimization within the learning process can help us enforce constraints and generalize to new forward models. Second, we will focus on the self-supervised setting in the context of scene-flow estimation, a task where we only have indirect measurements (sequences of point clouds) of the signal of interest (motion). In this case, we will see how a test-time optimization can create a learning target for a feed-forward network that can then be scaled to large unlabeled datasets. Finally, we will examine the problem of estimating multiple signals simultaneously from measurements: the recovery of geometry and motion from sequences of point clouds. Here, instead of embedding an optimization into the learning process, we do the reverse. We show how a global analysis-by-synthesis objective can be broken down into components appropriate for off-the-shelf models. In each of these problems, we will see that analysis by synthesis offers us a powerful and flexible paradigm for structuring our approaches and injecting learning in the right places.
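
For readers unfamiliar with the paradigm, the following is a minimal illustrative sketch of an analysis-by-synthesis loop on a toy 1D recovery problem, loosely analogous to depth completion. It is not taken from the thesis. The forward model (a sampling mask), the smoothness prior, the function names `synthesize` and `recover`, and all parameter values are assumptions chosen for illustration. A hypothesis signal is pushed through the forward model, compared with the observations, and refined by gradient descent.

```python
def synthesize(signal, mask):
    """Assumed forward model: keep only the samples where mask is True."""
    return [s for s, m in zip(signal, mask) if m]


def recover(measurements, mask, lam=1.0, step=0.1, iters=2000):
    """Estimate a full signal from partial measurements.

    Minimizes  sum_observed (x_i - y_i)^2 + lam * sum_i (x_{i+1} - x_i)^2
    by plain gradient descent. lam, step, and iters are illustrative values.
    """
    n = len(mask)
    observed = [i for i, m in enumerate(mask) if m]
    x = [0.0] * n  # initial hypothesis

    for _ in range(iters):
        # Synthesis: predict the measurements the current hypothesis implies.
        predicted = synthesize(x, mask)

        # Analysis: gradient of the data term (prediction vs. observation).
        grad = [0.0] * n
        for idx, p, y in zip(observed, predicted, measurements):
            grad[idx] += 2.0 * (p - y)

        # Prior: gradient of the smoothness penalty on neighboring samples.
        for i in range(n - 1):
            d = x[i + 1] - x[i]
            grad[i] -= 2.0 * lam * d
            grad[i + 1] += 2.0 * lam * d

        # Refine the hypothesis.
        x = [xi - step * gi for xi, gi in zip(x, grad)]

    return x


if __name__ == "__main__":
    mask = [True, False, True, False, True]
    measurements = [0.0, 2.0, 4.0]  # samples of a ramp at indices 0, 2, 4
    print(recover(measurements, mask))
```

In this toy setting the unobserved samples are filled in by the prior alone, while the observed samples balance the prior against the measurements. Swapping in a different `synthesize` function changes the forward model without changing the rest of the loop, which is the flexibility the abstract attributes to the paradigm.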

BibTeX

@phdthesis{Chodosh-2024-142566,
author = {Nathaniel Chodosh},
title = {Analysis by Synthesis for Modern Computer Vision},
year = {2024},
month = {August},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-24-55},
keywords = {Compressed Sensing, Analysis by Synthesis, 3D Reconstruction, 3D Computer Vision},
}