PhD Thesis Proposal
Carnegie Mellon University
Structured Learning for Robust Robot Manipulation
Abstract: Robust and generalizable robots that can autonomously manipulate objects in semi-structured environments can bring material benefits to society. Data-driven learning approaches are crucial for enabling such systems by identifying and exploiting patterns in semi-structured environments, allowing robots to adapt to novel scenarios with minimal human supervision. However, despite significant prior work in learning for [...]
Simulation-based Planning for Pick-and-Place in Heavy Clutter using Non-prehensile Manipulation
Abstract: Robot manipulation in domestic households, industrial manufacturing and warehouses might require contact-rich interactions with objects in the environment. For pick-and-place style grasping tasks in cluttered scenes, it can be more economical for the robot to rely on non-prehensile actions vis-à-vis deliberate prehensile rearrangement. Non-prehensile actions also let the robot manipulate large and bulky objects [...]
Learning with Diverse Forms of Imperfect and Indirect Supervision
Abstract: High capacity Machine Learning (ML) models trained on large, annotated datasets have driven impressive advances in several fields including natural language processing and computer vision, in turn leading to impactful applications of ML in areas such as healthcare, e-commerce, and predictive maintenance. However, obtaining annotated datasets at the scale required for training such models [...]
3D Representation Learning for Perception and Prediction: A Modular Yet Highly Integrated Approach
Abstract: Modularized and cascaded autonomy stacks (object detection, then tracking and then trajectory prediction) have been widely adopted in many autonomous systems such as self-driving cars due to their interpretability. In this talk, I advocate the use of such a modular approach but improve its accuracy and robustness by developing different 3D representations for each [...]
Understanding Unbalanced Datasets Through Simple Models and Dataset Exploration
Abstract: Computer vision models have proven to be tremendously capable of recognizing and detecting several classes and objects. They succeed in classes widely ranging in type and scale from humans to cans to pens. However, the best performing classes have abundant examples in large-scale datasets today. In unbalanced datasets, where some categories are seen in [...]
Self-Supervising Occlusions for Vision
Abstract: Virtually every scene has occlusions. Even a scene with a single object exhibits self-occlusions - a camera can only view one side of an object (left or right, front or back), or part of the object is outside the field of view. More complex occlusions occur when one or more objects block part(s) of [...]
Massively Parallelized Lazy Planning Algorithms
Abstract: Search-based planning algorithms enable autonomous agents like robots to come up with well-reasoned long horizon plans to achieve a given task objective. They do so by optimizing a task-specific cost function while respecting the constraints on either the agent (e.g. motion constraints) or the environment (e.g. obstacles). In robotics, such as in motion planning [...]
Run-Time Optimization in the Deep Learning Age
Abstract: In a recovery task one seeks to obtain an estimate of an unknown signal from a set of incomplete measurements. These problems arise in a number of computer vision applications, from image based tasks such as super-resolution and in-painting to 3D reconstruction tasks such as Non-Rigid Structure from Motion and scene flow estimation. Early [...]
System Identification and Control of Multiagent Systems Through Interactions
Abstract: This thesis investigates the problem of identifying dynamics models of individual agents of a multiagent system (MAS) and exploiting these models to shape their behavior using robots extrinsic to the MAS. While task-based control of a MAS using onboard controllers of its agents is well studied, we investigate (a) how easy it is for [...]
Driving Reconfigurable Unmanned Vehicle Design for Mobility Performance
Abstract: Unmanned ground vehicles are being deployed in increasingly diverse and complex environments. Advances in the field of robotics, including perception technology, computing power, and machine learning, have brought robots from the lab to the real world. Remote and autonomous vehicles are now used to explore volcanoes, caves, pipes, war zones, disaster sites, and even [...]
Towards Large-scale and Long-term Neural Map Representations
Abstract: We address the problem of large-scale and long-term neural map representations. Maps, as our prior understanding of the environment, provide valuable information for modern robotics applications such as autonomous driving and AR/VR. The size of maps largely affects the end task performance: usually a more detailed map can support better performance, but would cost [...]
Manipulating Objects with Challenging Visual and Geometric Properties
Abstract: Object manipulation is a well-studied domain in robotics, yet manipulation remains difficult for objects with visually and geometrically challenging properties. Visually challenging properties, such as transparency and specularity, break assumptions of Lambertian reflectance that existing methods rely on for grasp estimation. On the other hand, deformable objects such as cloth pose both visual and [...]
Large Scale Dense 3D Reconstruction via Sparse Representations
Abstract: Scene reconstruction systems take in (3D) videos as input, and output 3D models with associated poses for the inputs. With the growing demand for 3D content generation, the technique has evolved rapidly in recent years. For professionals equipped with depth sensors, efficient dense reconstruction systems have become available to recover scene geometry. For ordinary [...]
Teaching Agent Reward Functions via Demonstrations for Human Inverse Reinforcement Learning
Abstract: For intelligent agents (e.g. robots) to be seamlessly integrated into human society, humans must be able to understand their decision making. For example, the decision making of autonomous cars must be clear to the engineers certifying their safety, passengers riding them, and nearby drivers negotiating the road simultaneously. As an agent's decision making can [...]
Policy Decomposition: Approximate Optimal Control with Suboptimality Estimates
Abstract: Optimal Control is a formulation for designing controllers for dynamical systems by posing it as an optimization problem, whereby the desired long-term behavior of the system is expressed using a cost function. The objective is to compute a policy, i.e. a mapping from the state of the system to its control inputs, that minimizes [...]
Audience-Aware Legibility for Social Navigation
Abstract: Robots often need to communicate their goals to humans when navigating in a shared space to assist observers in anticipating the robot’s future actions. These human observers are often scattered throughout the environment, and each observer only has a partial view of the robot and its movements. A path that non-verbally communicates with multiple [...]
On Sample-Efficient Reinforcement Learning for Nuclear Fusion
Abstract: In many practical applications of reinforcement learning (RL), it is expensive to observe state transitions from the environment. For example, in the problem of plasma control for nuclear fusion, determining the next state for a given state-action pair requires querying an expensive transition function which can lead to many hours of computer simulation or [...]
Towards reconstructing non-rigidity from single camera
Abstract: In this proposal, we study how to infer 3D from images captured by a single camera, without assuming that the target scenes or objects are static. The non-static setting makes our problem ill-posed and challenging to solve, but is vital in practical applications where the target of interest is non-static. To solve ill-posed problems, the current trend in [...]
Efficient 3D Representations: Algebraic Surfaces for Differentiable Rendering
Abstract: In this proposal, we show how some classic computer vision tasks can robustly be solved via optimization techniques by using an object representation that is compact and interpretable. Specifically, we explore the applications and benefits of representing 3D objects with an analytical, algebraic function by building an approximate, ray-based differentiable renderer. Our approximate formulation [...]
Continual Robot Learning: Benchmarks and Modular Methods
Abstract: The earliest reinforcement learning models were designed to learn one task, specified up-front. However, an agent operating freely in the real world will not in general be granted this luxury, as the demands placed on the agent may change as environments or goals change. We refer to this ever-shifting scenario [...]
Improving Robotic Exploration with Self-Supervision and Diverse Data
Abstract: Reinforcement learning (RL) holds great promise for improving robotics, as it allows systems to move beyond passive learning and interact with the world while learning from these interactions. A key aspect of this interaction is exploration: which actions should an RL agent take to best learn about the world? Prior work on exploration is typically [...]
Combining Offline Reinforcement Learning with Stochastic Multi-Agent Planning for Autonomous Driving
Abstract: Fully autonomous vehicles have the potential to greatly reduce vehicular accidents and revolutionize how people travel and how we transport goods. Many of the major challenges for autonomous driving systems emerge from the numerous traffic situations that require complex interactions with other agents. For the foreseeable future, autonomous vehicles will have to share the [...]
Causal Robot Learning for Manipulation
Abstract: Two decades into the third age of AI, the rise of deep learning has yielded two seemingly disparate realities. In one, massive accomplishments have been achieved in deep reinforcement learning, protein folding, and large language models. Yet, in the other, the promises of deep learning to empower robots that operate robustly in real-world environments [...]
Dense Reconstruction of Dynamic Structures from Monocular RGB Videos
Abstract: We study the problem of 3D reconstruction of generic and deformable objects and scenes from casually-taken RGB videos, to create a system for capturing the dynamic 3D world. Being able to reconstruct dynamic structures from casual videos allows one to create avatars and motion references for arbitrary objects without specialized devices, [...]
Learning via Visual-Tactile Interaction
Abstract: Humans learn by interacting with their surroundings using all of their senses. The first of these senses to develop is touch, and it is the first way that young humans explore their environment, learn about objects, and tune their cost functions (via pain or treats). Yet, robots are often denied this highly informative and [...]
Tactile SLAM: perception for dexterity via vision-based touch
Abstract: Touch provides a direct window into robot-object interaction, free from occlusion and aliasing faced by visual sensing. Collated tactile perception can facilitate contact-rich tasks, such as in-hand manipulation, sliding, and grasping. Here, online estimates of object geometry and pose are crucial for downstream planning and control. With significant advances in tactile sensing, like vision-based touch, a [...]
Resource Allocation for Learning in Robotics
Abstract: Robots operating in the real world need fast and intelligent decision making systems. While these systems have traditionally consisted of human-engineered behaviors and world models, there has been a lot of interest in integrating them with data-driven components to achieve faster execution and reduce hand-engineering. Unfortunately, these learning-based methods require large amounts of training [...]
Planning with Dynamics by Interleaving Search and Trajectory Optimization
Abstract: Search-based planning algorithms enable autonomous agents like robots to come up with well-reasoned long-horizon plans to achieve a given task objective. They do so by searching over the graph that results from discretizing the state and action space. However, in robotics, several dynamically rich tasks require high-dimensional planning in the continuous space. For such [...]
Utilizing Panoptic Segmentation and a Locally-Conditioned Neural Representation to Build Richer 3D Maps
Abstract: Advances in deep-learning based perception and maturation of volumetric RGB-D mapping algorithms have allowed autonomous robots to be deployed in increasingly complex environments. For robust operation in open-world conditions however, perceptual capabilities are still lacking. Limitations of commodity depth sensors mean that complex geometries and textures cannot be reconstructed accurately. Semantic understanding is still [...]
Multi-Human 3D Reconstruction from Monocular RGB Videos
Abstract: We study the problem of multi-human 3D reconstruction from RGB videos captured in the wild. Humans have dynamic motion, and reconstructing them in arbitrary settings is key to building immersive social telepresence, assistive humanoid robots, and augmented reality systems. However, creating such a system requires addressing fundamental issues with previous works regarding the data [...]
Learning and Translating Temporal Abstractions across Humans and Robots
Abstract: Humans possess a remarkable ability to learn to perform tasks from a variety of different sources, including language, instructions, and demonstrations. In each case, they are able to easily extract the high-level strategy to solve the task, such as the recipe of cooking a dish, whilst ignoring irrelevant details, such as the precise shape of [...]
Predicting The Future and Linking the Past: Learning and Constructing Structured Models for Robotic Manipulation
Abstract: Intelligent robotic agents need to reason about the dynamics of their surrounding world, and use such dynamics reasoning to make future predictions for efficient task planning. In addition, it is also desirable for robots to associate past experience in their memories with their current observations, and conduct analogical reasoning to complete tasks at their [...]
Perception for High-Speed Off-Road Driving
Abstract: On-road autonomous driving has seen rapid progress in recent years with driverless vehicles being tested in various cities worldwide. However, this progress is limited to cities with well-established infrastructure and has yet to transfer to off-road regimes with unstructured environments and few paved roads. Advances in high-speed and reliable autonomous off-road driving can unlock [...]
Continual Learning of Compositional Skills for Robust Robot Manipulation
Abstract: Real world robots need to continuously learn new manipulation tasks in a lifelong learning manner. These new tasks often share sub-structures (in the form of sub-tasks, controllers) with previously learned tasks. To utilize these shared sub-structures, we explore a compositional and object-centric approach to learn manipulation tasks. While compositionality in robot manipulation can manifest [...]
Equivalent Policy Sets for Learning Aligned Models and Abstractions
Abstract: Recent successes in model-based reinforcement learning (MBRL) have demonstrated the enormous value that learned representations of environmental dynamics (i.e., models) can impart to autonomous decision making. While a learned model can never perfectly represent the dynamics of complex environments, models that are accurate in the "right" ways may still be highly useful for decision [...]
Adaptive Robotic Assistance through Observations of Human Behavior
Abstract: Assistive robots should take actions that support people's goals. This is especially true as robots enter into environments where personal agency is paramount, such as a person's home. Home environments have a wide variety of "optimal" solutions that depend on personal preference, making it difficult for a robot to know the goal it should [...]
Beyond Pick-and-Place: Towards Dynamic and Contact-rich Motor Skills with Reinforcement Learning
Abstract: Interactions with the physical world are at the core of robotics. However, robotics research, especially in manipulation, has been mainly focused on tasks with limited interactions with the physical world such as pick-and-place or pushing objects on the table top. These interactions are often quasi-static, have a predefined or limited sequence of contact events and [...]
Adaptive-Anytime Planning and Mapping for Multi-Robot Exploration in Large Environments
Abstract: Robotic systems are being leveraged to explore environments too hazardous for humans to enter. Robot sensing, compute, and kinodynamic (SCK) capabilities are inextricably tied to the size, weight, and power (SWaP) constraints of the vehicle. When designing a robot team for exploration, the diversity and types of robots used must be carefully considered because [...]
Enabling Data-Efficient Real-World Model-Based Manipulation by Estimating Preconditions for Inaccurate Models
Abstract: This thesis explores estimating and reasoning about model deviation in robot learning for manipulation to improve data efficiency and reliability to enable real-robot manipulation in a world where models are inaccurate but still useful. Existing strategies are presented for improving planning robustness with low amounts of real-world data by an empirically estimated model precondition to guide [...]
Robust Adaptive Reinforcement Learning for Safety Critical Applications via Curricular Learning
Abstract: Reinforcement Learning (RL) holds great promise for autonomous agents. However, when using robots in a safety critical domain, a system has to be robust enough to be deployed in real life. For example, the robot should be able to perform across different scenarios it will encounter. The robot should avoid entering undesirable and irreversible [...]
Towards Photorealistic Dynamic Capture and Animation of Human Hair and Head
Abstract: Realistic human avatars play a key role in immersive virtual telepresence. To reach a high level of realism, a human avatar needs to faithfully reflect human appearance. A human avatar should also be drivable and express natural motions. Existing works have made significant progress on building drivable realistic face avatars, but they rarely include [...]
Eye Gaze for Intelligent Driving
Abstract: Intelligent vehicles have been proposed as one path to increasing vehicular safety and reducing on-road crashes. Driving intelligence has taken many forms, ranging from simple blind spot occupancy or forward collision warnings to lane keeping and all the way to full driving autonomy in certain situations. Primarily, these methods are outward-facing and operate on [...]
Passive Coupling in Robot Swarms
Abstract: In unstructured environments, ant colonies demonstrate remarkable abilities to adaptively form functional structures in response to various obstacles, such as stairs, gaps, and holes. Drawing inspiration from these creatures, robot swarms can collectively exhibit complex behaviors and achieve tasks that individual robots cannot accomplish. Existing modular robot platforms that employ dynamic coupling and decoupling [...]
Learning to Perceive and Predict Everyday Interactions
Abstract: This thesis aims to develop a computer vision system that can understand everyday human interactions with rich spatial information. Such systems can benefit VR/AR by perceiving reality and modifying its virtual twin, and robotics by learning manipulation from watching humans. Previous methods have been limited to constrained lab environments or pre-selected objects with [...]
Active Vision for Manipulation
Abstract: Decades of research on computer vision have highlighted the importance of active sensing -- where the agent actively controls parameters of the sensor to improve perception. Research on active perception in the context of robotic manipulation has demonstrated many novel and robust sensing strategies involving a multitude of sensors like RGB and RGBD cameras, a [...]
Design Iteration of Dexterous Compliant Robotic Manipulators
Abstract: One goal of personal robotics is to have robots in homes performing everyday tasks efficiently to improve our quality of life. Towards this end, manipulators are needed which are low cost, safe around humans, and approach human-level dexterity. However, existing off-the-shelf manipulators are expensive both in cost and manufacturing time, difficult to repair, and [...]
Whisker Sensors for Unstructured Environments
Abstract: As robot applications expand from controllable factory settings to unknown environments, the robots will need a larger breadth of sensors to perceive these complex environments. In this thesis, I focus on developing whisker sensors for robot perception. The inspiration for whisker sensors comes from the biological world, where whiskers serve as tactile and flow [...]
Sparse-view 3D in the Wild
Abstract: Reconstructing 3D scenes and objects from images alone has been a long-standing goal in computer vision. We have seen tremendous progress in recent years, with methods now capable of producing near photo-realistic renderings from any viewpoint. However, existing approaches generally rely on a large number of input images (typically 50-100) in order to compute camera poses and [...]
Spectral Mapping using Simple Sensors for Micro-Explorers
Abstract: Spectral mapping is an essential task in exploration as it expands our understanding of material composition in an explored region. Although imaging spectrometers are ideal for obtaining spectra to construct spectral maps, their large size, high power consumption, and operational complexity make them impractical for small rovers and limited missions. In contrast, RGB cameras [...]
Simulation-driven vision-based tactile sensor design using Physics Based Rendering
Abstract: Touch is an essential sensing modality for making autonomous robots more dexterous and able to work collaboratively with humans. With the advent of vision-based tactile sensors, roboticists have tried to incorporate tactile sensors in various robot structures for various robotic manipulation tasks to increase robustness, precision, and reliability. However, the design of vision-based tactile sensors is [...]
Efficient Interactive Learning with Unobserved Confounders
Abstract: Interactive learning systems like self-driving cars, recommender systems, and large language model chatbots are becoming increasingly ubiquitous in everyday life. From a machine learning perspective, the key technical challenge underlying such systems is that rather than simple prediction on i.i.d. data, an interactive learner influences the distribution of inputs it sees via the choices [...]