PhD Thesis Defense
Learning with Structured Priors for Robust Robot Manipulation
Abstract: Robust and generalizable robots that can autonomously manipulate objects in semi-structured environments can bring material benefits to society. Data-driven learning approaches are crucial for enabling such systems by identifying and exploiting patterns in semi-structured environments, allowing robots to adapt to novel scenarios with minimal human supervision. However, despite significant prior work in learning for [...]
Carnegie Mellon University
Self-Supervising Occlusions For Vision
Abstract: Virtually every scene has occlusions. Even a scene with a single object exhibits self-occlusions - a camera can only view one side of an object (left or right, front or back), or part of the object is outside the field of view. More complex occlusions occur when one or more objects block part(s) of [...]
Learning with Diverse Forms of Imperfect and Indirect Supervision
Abstract: Powerful Machine Learning (ML) models trained on large, annotated datasets have driven impressive advances in fields including natural language processing and computer vision. In turn, such developments have led to impactful applications of ML in areas such as healthcare, e-commerce, and predictive maintenance. However, obtaining annotated datasets at the scale required for training high [...]
Computational Interferometric Imaging
Abstract: Imaging systems typically accumulate photons that, as they travel from a light source to a camera, follow multiple different paths and interact with several scene objects. This multi-path accumulation process confounds the information that is available in captured images about the scene and makes using these images to infer properties of scene objects, such [...]
Neural Radiance Fields with LiDAR Maps
Abstract: Maps, as our prior understanding of the environment, play an essential role for many modern robotic applications. The design of maps, in fact, is a non-trivial art of balance between storage and richness. In this thesis, we explored map compression for image-to-LiDAR registration, LiDAR-to-LiDAR map registration, and image-to-SfM map registration, and finally, inspired by [...]
System Identification and Control of Multiagent Systems Through Interactions
Abstract: This thesis investigates the problem of inferring the underlying dynamic model of individual agents of a multiagent system (MAS) and using these models to shape the MAS's behavior using robots extrinsic to the MAS. We investigate (a) how an observer can infer the latent task and inter-agent interaction constraints from the agents' motion and [...]
Parallelized Search on Graphs with Expensive-to-Compute Edges
Abstract: Search-based planning algorithms enable robots to come up with well-reasoned long-horizon plans to achieve a given task objective. They formulate the problem as a shortest path problem on a graph embedded in the state space of the domain. Much research has been dedicated to achieving greater planning speeds to enable robots to respond quickly [...]
Visual Dataset Pipeline: From Curation to Long-Tail Learning
Abstract: Computer vision models have proven to be tremendously capable of recognizing and detecting several real-world objects: cars, people, pets. These models are only possible due to a meticulous pipeline where a task and application are first conceived, followed by an appropriate dataset curation that collects and labels all necessary data. Commonly, studies are focused [...]
Optimization of Small Unmanned Ground Vehicle Design using Reconfigurability, Mobility, and Complexity
Abstract: Unmanned ground vehicles are being deployed in increasingly diverse and complex environments. With modern developments in sensing and planning, the field of ground vehicle mobility presents rich possibilities for mechanical innovations that may be especially relevant for unmanned systems. In particular, reconfigurability may enable vehicles to traverse a wider set of terrains with greater [...]
Towards Reconstructing Non-rigidity from Single Camera
Abstract: In this talk we will discuss how to infer 3D from images captured by a single camera, without assuming that the target scenes or objects are static. The non-static setting makes our problem ill-posed and challenging to solve, but is vital in practical applications where the target of interest is non-static. To solve ill-posed problems, the current trend [...]
Large Scale Dense 3D Reconstruction via Sparse Representations
Abstract: Dense 3D scene reconstruction is in high demand today for view synthesis, navigation, and autonomous driving. A practical reconstruction system inputs multi-view scans of the target using RGB-D cameras, LiDARs, or monocular cameras, computes sensor poses, and outputs scene reconstructions. These algorithms are computationally expensive and memory-intensive due to the presence of 3D data. [...]
From Reinforcement Learning to Robot Learning: Leveraging Prior Data and Shared Evaluation
Abstract: Unlike most machine learning applications, robotics involves physical constraints that make off-the-shelf learning challenging. Difficulties in large-scale data collection and training present a major roadblock to applying today’s data-intensive algorithms. Robot learning has an additional roadblock in evaluation: every physical space is different, making results across labs inconsistent. Two common assumptions of the robot [...]
Building 4D Models of Objects and Scenes from Monocular Videos
Abstract: We explore how to infer the time-varying 3D structures of generic, deformable objects, and dynamic scenes from monocular videos. A solution to this problem is essential for virtual reality and robotics applications. However, inferring 4D structures given 2D observations is challenging due to its under-constrained nature. In a casual setup where there is neither [...]
Learning via Visual-Tactile Interaction
Abstract: Humans learn by interacting with their surroundings using all of their senses. The first of these senses to develop is touch, and it is the first way that young humans explore their environment, learn about objects, and tune their cost functions (via pain or treats). Yet, robots are often denied this highly informative and [...]
Redefining the Perception-Action Interface: Visual Action Representations for Contact-Centric Manipulation
Abstract: In robotics, understanding the link between perception and action is pivotal. Typically, perception systems process sensory data into state representations like segmentations and bounding boxes, which a planner uses to plan actions. However, this state estimation approach can fail in environments with partial observability, and in cases with challenging object properties like transparency and deformability. [...]
Multi-Human 3D Reconstruction from Monocular Videos
Abstract: We study the problem of multi-human 3D reconstruction from videos captured in the wild. Human movements are dynamic, and accurately reconstructing them in various settings is crucial for developing immersive social telepresence, assistive humanoid robots, and augmented reality systems. However, creating such a system requires addressing fundamental issues with previous works regarding the data [...]
How I Learned to Love Blobs: The Power of Gaussian Representations in Differentiable Rendering and Optimization
Abstract: In this thesis, we explore the use of Gaussian Representations in multiple application areas of computer vision and robotics. In particular, we design a ray-based differentiable renderer for 3D Gaussians that can be used to solve multiple classic computer vision problems in a unified manner. For example, we can reconstruct 3D shapes from color, [...]
Towards Photorealistic Dynamic Capture and Animation of Human Hair and Head
Abstract: Realistic human avatars play a key role in immersive virtual telepresence. To reach a high level of realism, a human avatar needs to faithfully reflect human appearance. A human avatar should also be drivable and express natural motions. Existing works have made significant progress in building drivable realistic face avatars, but they rarely include [...]
Modeling Dynamic Clothing for Data-Driven Photorealistic Avatars
Abstract: In this thesis, we aim to build photorealistic animatable avatars of humans wearing complex clothing in a data-driven manner. Such avatars will be a critical technology to enable future applications such as immersive telepresence in Virtual Reality (VR) and Augmented Reality (AR). Existing full-body avatars that jointly model geometry and view-dependent texture using Variational [...]
Manipulation Among Movable Objects for Pick-and-Place Tasks in Cluttered 3D Workspaces
Abstract: In cluttered real-world workspaces, simple pick-and-place tasks for robot manipulators can be quite challenging to solve. Often there is no collision-free trajectory that allows the robot to grasp and extract a desired object from the scene. This requires motion planning algorithms to reason about rearranging some of the “movable” clutter in the scene so [...]
Generalizable Dexterity with Reinforcement Learning
Abstract: Dexterity, the ability to perform complex interactions with the physical world, is at the core of robotics. However, existing research in robot manipulation has been focused on tasks that involve limited dexterity, such as pick-and-place. The motor skills of the robots are often quasi-static, have a predefined or limited sequence of contact events, and [...]
Sample-Efficient Reinforcement Learning with applications in Nuclear Fusion
Abstract: In many practical applications of reinforcement learning (RL), it is expensive to observe state transitions from the environment. In the problem of plasma control for nuclear fusion, the motivating example of this thesis, determining the next state for a given state-action pair requires querying an expensive transition function which can lead to many hours [...]
Social Navigation with Pedestrian Groups
Abstract: Autonomous navigation in human crowds (i.e., social navigation) presents several challenges: The robot often needs to rely on its noisy sensors to identify and localize pedestrians in human crowds; the robot needs to plan efficient paths to reach its goals; the robot needs to do so in a safe and socially appropriate manner. Recent [...]
Design Iteration of Dexterous Compliant Robotic Manipulators
Abstract: The goal of personal robotics is to have robots in homes performing everyday tasks efficiently to improve our quality of life. Towards this end, manipulators are needed which are low cost, safe around humans, and approach human-level dexterity. However, existing off-the-shelf manipulators are expensive both in cost and manufacturing time, difficult to repair, and [...]
Continual Learning of Compositional Skills for Robust Robot Manipulation
Abstract: Real world robots need to continuously learn new manipulation tasks in a lifelong learning manner. These new tasks often share many sub-structures (e.g., sub-tasks, controllers, preconditions) with previously learned tasks. To utilize these shared sub-structures, we explore a compositional and object-centric approach to learn manipulation tasks. The first part of this thesis focuses on [...]
Watch, Practice, Improve: Towards In-the-wild Manipulation
Abstract: The longstanding dream of many roboticists is to see robots perform diverse tasks in diverse environments. To build such a robot that can operate anywhere, many methods train on robotic interaction data. While these approaches have led to significant advances, they rely on heavily engineered setups or high amounts of supervision, neither of which [...]
Improving the Transparency of Agent Decision Making to Humans Using Demonstrations
Abstract: For intelligent agents (e.g. robots) to be seamlessly integrated into human society, humans must be able to understand their decision making. For example, the decision making of autonomous cars must be clear to the engineers certifying their safety, passengers riding them, and nearby drivers negotiating the road simultaneously. As an agent's decision making depends [...]
Perception amidst interaction: spatial AI with vision and touch for robot manipulation
Abstract: Robots currently lack the cognition to replicate even a fraction of the tasks humans do, a trend summarized by Moravec's Paradox. Humans effortlessly combine their senses for everyday interactions—we can rummage through our pockets in search of our keys, and deftly insert them to unlock our front door. Before robots can demonstrate such dexterity, [...]
Sparse-view 3D in the Wild
Abstract: Reconstructing 3D scenes and objects from images alone has been a long-standing goal in computer vision. We have seen tremendous progress in recent years, with methods now capable of producing near photo-realistic renderings from any viewpoint. However, existing approaches generally rely on a large number of input images (typically 50-100) to compute camera poses and ensure view [...]
Offline Learning for Stochastic Multi-Agent Planning in Autonomous Driving
Abstract: Fully autonomous vehicles have the potential to greatly reduce vehicular accidents and revolutionize how people travel and how we transport goods. Many of the major challenges for autonomous driving systems emerge from the numerous traffic situations that require complex interactions with other agents. For the foreseeable future, autonomous vehicles will have to share the [...]
Improving Robot Capabilities Through Reconfigurability
Abstract: Advancements in robot capabilities are often achieved through integrating more hardware components. These hardware additions often lead to systems with high power consumption, fragility, and difficulties in control and maintenance. However, is this approach the only path to enhancing robot functionality? In this talk, I introduce the PuzzleBots, a modular multi-robot system with passive [...]
Spectral Mapping using Simple Sensors
Abstract: Spectral mapping holds significant importance in many exploration endeavors as it facilitates a deeper comprehension of material composition within a surveyed area. While imaging spectrometers excel in recording reflectance spectra into spectral maps, their large physical footprint, substantial power requirements, and operational intricacies render them unsuitable for integration into small rovers or resource-constrained missions. [...]
Causal Robot Learning for Manipulation
Abstract: Two decades into the third age of AI, the rise of deep learning has yielded two seemingly disparate realities. In one, massive accomplishments have been achieved in deep reinforcement learning, protein folding, and large language models. Yet, in the other, the promises of deep learning to empower robots that operate robustly in real-world environments [...]
Learning to Manipulate Using Diverse Datasets
Abstract: Autonomous agents can play games (like Chess, Go, and even Starcraft), they can help make complex scientific predictions (e.g., protein folding), and they can even write entire computer programs, with just a bit of prompting. However, even the most basic physical manipulation skills, like unlocking and opening a door, still remain literally out-of-reach. The [...]
Plan to Learn: Active Robot Learning by Planning
Abstract: Robots need a diverse repertoire of capable motor skills to succeed in the open world. Such a skillset cannot be learned or designed purely on human initiative. In this thesis, we advocate for an active continual learning approach that enables robots to take charge of their own learning. The goal of an autonomously learning [...]
Policy Decomposition
Abstract: Optimal Control is a popular formulation for designing controllers for dynamic robotic systems. Under the formulation, the desired long-term behavior of the system is encoded via a cost function, and the policy, i.e., a mapping from the state of the system to control commands that achieves the desired behavior, is obtained by solving an [...]
Analysis by Synthesis for Modern Computer Vision
Abstract: Image denoising, depth completion, scene flow, and dynamic 3D reconstruction are all examples of recovery problems: the estimation of multidimensional signals from corrupted or partial measurements. This thesis examines these problems from the classic analysis-by-synthesis perspective, where a signal model is used to propose hypotheses, which are then compared to observations. This paradigm has [...]
A Modularized Approach to Vision-based Tactile Sensor Design Using Physics-based Rendering
Abstract: Touch is an essential sensing modality for making autonomous robots more dexterous and allowing them to work collaboratively with humans. In particular, the advent of vision-based tactile sensors has resulted in efforts to design them for different robotic manipulation tasks. However, this design task remains a challenging problem. This is for two reasons: first, [...]
Interleaving Discrete Search and Continuous Optimization for Kinodynamic Motion Planning
Abstract: Motion planning for dynamically complex robotic tasks requires explicit reasoning within constraints on velocity, acceleration, force/torque, and kinematics such as avoiding obstacles. To meet these constraints, planning algorithms must simultaneously make high-level discrete decisions and low-level continuous decisions. For example, pushing a heavy object involves making discrete decisions about contact locations and continuous decisions [...]
Goal-Expressive Movement for Social Navigation: Where and When to Behave Legibly
Abstract: Robots often need to communicate their navigation goals to assist observers in anticipating the robot's future actions. Enabling observers to infer where a robot is going from its movements is particularly important as robots begin to share workplaces, sidewalks, and social spaces with humans. We can use legible motion, or movements that use intentional [...]
Eye Gaze for Intelligent Driving
Abstract: Intelligent vehicles have been proposed as one path to increasing traffic safety and reducing on-road crashes. Driving “intelligence” today takes many forms, ranging from simple blind spot occupancy or forward collision warnings to distance-aware cruise and all the way to full driving autonomy in certain situations. Primarily, these methods are outward-facing and operate on [...]
Learning to Perceive and Predict Everyday Interactions
Abstract: This thesis aims to build computer systems to understand everyday hand-object interactions in the physical world – both perceiving ongoing interactions in 3D space and predicting possible interactions. This ability is crucial for applications such as virtual reality, robotic manipulations, and augmented reality. The problem is inherently ill-posed due to the challenges of one-to-many [...]
Deep Learning for Tactile Sensing: Development to Deployment
Abstract: The role of sensing is widely acknowledged for robots interacting with the physical environment. However, few contemporary sensors have gained widespread use among roboticists. This thesis proposes a framework for incorporating sensors into a robot learning paradigm, from development to deployment, through the lens of ReSkin -- a versatile and scalable magnetic tactile sensor. [...]
Learning and Translating Temporal Abstractions of Behaviour across Humans and Robots
Abstract: Humans are remarkably adept at learning to perform tasks by imitating other people demonstrating these tasks. Key to this is our ability to reason abstractly about the high-level strategy of the task at hand (such as the recipe of cooking a dish) and the behaviours needed to solve this task (such as the behaviour [...]
Assistive value alignment using in-situ naturalistic human behaviors
Abstract: As collaborative robots are increasingly deployed in personal environments, such as the home, it is critical they take actions to complete tasks consistent with personal preferences. Determining personal preferences for completing household chores, however, is challenging. Many household chores, such as setting a table or loading a dishwasher, are sequential and open-vocabulary, creating a [...]
Exploration for Continually Improving Robots
Abstract: Data-driven learning is a powerful paradigm for enabling robots to learn skills. Current prominent approaches involve collecting large datasets of robot behavior via teleoperation or simulation, to then train policies. For these policies to generalize to diverse tasks and scenes, there is a large burden placed on constructing a rich initial dataset, which is [...]
Domesticating Soft Robotics Research and Development with Accessible Biomaterials
Abstract: Current trends in robotics design and engineering are typically focused on high value applications where high performance, precision, and robustness take precedence over cost, accessibility, and environmental impact. In this paradigm, the capability landscape of robotics is largely shaped by access to capital and the promise of economic return. This thesis explores an alternative [...]
Moving Lights and Cameras for Better 3D Perception of Indoor Scenes
Abstract: Decades of research on computer vision have highlighted the importance of active sensing -- where an agent controls the parameters of the sensors to improve perception. Research on active perception in the context of robotic manipulation has demonstrated many novel and robust sensing strategies involving a multitude of sensors like RGB and RGBD cameras [...]
Trustworthy Learning using Uncertain Interpretation of Data
Abstract: Motivated by the potential of Artificial Intelligence (AI) in high-cost and safety-critical applications, and recently also by the increasing presence of AI in our everyday lives, Trustworthy AI has grown in prominence as a broad area of research encompassing topics such as interpretability, robustness, verifiable safety, fairness, privacy, accountability, and more. This has created [...]
Whisker-Inspired Sensors for Unstructured Environments
Abstract: Robots lack the perception abilities of animals, which is one reason they cannot achieve complex control in outdoor unstructured environments with the same ease as animals. One cause of the perception gap is the constraints researchers place on the environments in which they test new sensors so algorithms can correctly interpret data from [...]
Differentiable Convex Modeling for Robotic Planning and Control
Abstract: Robotic simulation, planning, estimation, and control have all been built on top of numerical optimization. At the same time, modern convex optimization has matured into a robust technology delivering globally optimal solutions in polynomial time. With advances in differentiable optimization and custom solvers capable of producing smooth derivatives, convex modeling has become fast, reliable, [...]
Towards a Universal Data Engine for Robotics and Beyond
Abstract: Robotics researchers have been attempting to extend data-driven breakthroughs in fields like computer vision and language processing into robot learning. However, unlike vision or language domains where massive amounts of data are readily available on the internet, training robotic policies relies on physical and interactive data collected via interacting with the physical world -- [...]