Carnegie Mellon University
Abstract:
Factor graphs offer a flexible and powerful framework for solving large-scale, nonlinear inference problems as encountered in robot perception and control. Typically, these methods rely on handcrafted models that are efficient to optimize. However, robots often perceive the world through complex, high-dimensional sensor observations. For instance, consider a robot manipulating an object in hand and receiving high-dimensional tactile observations from which it must infer latent object poses. How do we couple machine learning, to extract salient information from observations, with graph optimization, to efficiently fuse that information?
In this thesis, we address three principal challenges: (1) How do we learn observation models from data with optimizers in the loop? We show that learning observation models can be viewed as shaping energy functions that graph optimizers, even non-differentiable ones, optimize. (2) How do we impose hard constraints, derived from real-world physics or geometry, in graph optimizers? We extend incremental Gauss-Newton solvers into a broader primal-dual framework that efficiently handles constraints in an online manner. (3) Finally, we examine different learned feature representations that extract salient information from tactile image observations.
We evaluate these approaches on a real-world application of tactile perception for robot manipulation, where we demonstrate reliable object tracking in hundreds of trials across planar pushing and in-hand manipulation tasks. This thesis establishes novel connections between factor graph inference, constrained optimization, and energy-based learning, opening avenues for new research problems at the intersection of these topics.
Thesis Committee Members:
Michael Kaess, Chair
David Wettergreen
Oliver Kroemer
Stuart Anderson, Meta AI Research
John Leonard, MIT