Learning Visual, Audio, and Cross-Modal Correspondences
Abstract: Today's machine perception systems rely heavily on supervision provided by humans, such as labels and natural language. I will talk about our efforts to make systems that, instead, learn from two ubiquitous sources of unlabeled data: visual motion and cross-modal sensory associations. I will begin by discussing our work on creating unified models for [...]
Impulse considerations for reasoning about intermittent contacts
Abstract: Many of our interactions with the environment involve making and breaking contacts. However, it is not always obvious how one should reason about these intermittent contacts (sequence, timings, locations) in an online and adaptive way. This is particularly relevant in gait generation for legged locomotion control, where it is standard to simply predefine and [...]
Multi-Human 3D Reconstruction from Monocular RGB Videos
Abstract: We study the problem of multi-human 3D reconstruction from RGB videos captured in the wild. Humans have dynamic motion, and reconstructing them in arbitrary settings is key to building immersive social telepresence, assistive humanoid robots, and augmented reality systems. However, creating such a system requires addressing fundamental issues with previous works regarding the data [...]
Learning and Translating Temporal Abstractions across Humans and Robots
Abstract: Humans possess a remarkable ability to learn to perform tasks from a variety of sources: language, instructions, demonstrations, etc. In each case, they can easily extract the high-level strategy for solving the task, such as the recipe for cooking a dish, while ignoring irrelevant details, such as the precise shape of [...]
Robust Incremental Smoothing and Mapping
Abstract: In this work, we present a method for robust optimization in online incremental Simultaneous Localization and Mapping (SLAM). Due to the NP-hardness of data association in the presence of perceptual aliasing, tractable (approximate) approaches to data association will produce erroneous measurements. We require SLAM back-ends that can converge to accurate solutions in the presence [...]
Carnegie Mellon University
3D Reconstruction using Differential Imaging
Abstract: 3D reconstruction has been at the core of many computer vision applications, including autonomous driving, visual inspection in manufacturing, and augmented and virtual reality (AR/VR). Because monocular 3D sensing is fundamentally ill-posed, many techniques aiming for accurate reconstruction use multiple captures to solve the inverse problem. Depending on the amount of change in these [...]
Learning with Structured Priors for Robust Robot Manipulation
Abstract: Robust and generalizable robots that can autonomously manipulate objects in semi-structured environments can bring material benefits to society. Data-driven learning approaches are crucial for enabling such systems by identifying and exploiting patterns in semi-structured environments, allowing robots to adapt to novel scenarios with minimal human supervision. However, despite significant prior work in learning for [...]
Learning Parameter-Efficient Quadrotor Dynamics Models
Abstract: Operating quadrotors through high-speed, high-acceleration maneuvers remains a challenging problem due to the complex aerodynamics in this regime. While standard physical models suffice for control in near-hover conditions, the primary challenge in executing aggressive trajectories is obtaining a quadrotor dynamics model that adequately captures the aerodynamic effects present, including lift, drag, [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Self-Supervising Occlusions For Vision
Abstract: Virtually every scene has occlusions. Even a scene with a single object exhibits self-occlusions: a camera can view only one side of an object (left or right, front or back), or part of the object is outside the field of view. More complex occlusions occur when one or more objects block part(s) of [...]