MSR Thesis Defense
Carnegie Mellon University
Learning with Auxiliary Supervision
Abstract: Supervised learning for high-level vision tasks has advanced significantly over the last decade. One of the primary driving forces for these improvements has been the availability of vast amounts of labeled data. However, annotating data is an expensive and time-consuming process. For example, densely segmenting a natural scene image takes approximately 30 minutes. This mode [...]
Inverse Reinforcement Learning with Conditional Choice Probabilities
Abstract: We make an important connection to existing results in econometrics to describe an alternative formulation of inverse reinforcement learning (IRL). In particular, we describe an algorithm to solve the IRL problem, using easy-to-compute estimates of the Conditional Choice Probability (CCP) vector, which is the policy function of an expert integrated over factors econometricians cannot [...]
Monocular Depth Reconstruction using Geometry and Deep Networks
In this thesis, we explore methods of building dense depth maps from monocular video. First, we introduce our multi-view stereo pipeline, which uses photometric bundle adjustment to obtain accurate depth for textured regions from small-motion video. Second, we improve the depth estimation of low-texture regions by fusing deep convolutional network predictions. We categorize the [...]
Carnegie Mellon University
Learning Depth from Monocular Videos using Direct Methods
The ability to predict depth from a single image, using recent advances in CNNs, is of increasing interest to the vision community. Unsupervised learning strategies are particularly appealing as they can utilize much larger and more varied monocular video datasets during learning without the need for ground truth depth or stereo. In previous works, separate pose and [...]
Carnegie Mellon University
Learning-based Lane Following and Changing Behaviors for Autonomous Vehicle
This thesis explores learning-based methods for generating human-like lane following and changing behaviors in on-road autonomous driving. We summarize our main contributions as: 1) derive an efficient vision-based end-to-end learning system for on-road driving; 2) propose a novel attention-based learning architecture with sub-action space to obtain lane changing behavior using a deep reinforcement learning algorithm; [...]
Carnegie Mellon University
Real-to-Virtual Domain Unification for End-to-End Autonomous Driving
Abstract: In the spectrum of vision-based autonomous driving, vanilla end-to-end models are not interpretable and suboptimal in performance, while mediated perception models require additional intermediate representations such as segmentation masks or detection bounding boxes, whose annotation can be prohibitively expensive as we move to a larger scale. More critically, all prior works fail to deal with the notorious [...]
Carnegie Mellon University
Reconstruction of dynamic vehicles from multiple unsynchronized cameras
Despite significant research in the area, reconstruction of multiple dynamic rigid objects (e.g., vehicles) observed from wide-baseline, uncalibrated, and unsynchronized cameras remains hard. On one hand, feature tracking works well within each view, but features are hard to correspond across multiple cameras with limited overlap in fields of view or due to occlusions. On the other [...]
Carnegie Mellon University
Ergodic Coverage and Active Search in Constrained Environments
In this thesis, we explore sampling-based trajectory optimization applied to search for objects of interest in constrained environments (e.g., a UAV searching for a target in the presence of obstacles). We consider two search scenarios: in the first scenario, an accurate prior information distribution over the possible locations of the objects of interest is available, thus [...]
Carnegie Mellon University
Understanding Machine Vision through Human Vision
Abstract: Recent success in machine vision has been largely driven by advanced computer vision methods, most commonly known as deep learning based methods. While we have seen tremendous performance improvements in machine visual tasks, such as object categorization and segmentation, there remain two major issues in deep learning. Firstly, deep networks have been largely unable [...]
Carnegie Mellon University
Automated design, accessible fabrication, and learning-based control on cable-driven soft robots with complex shapes
The emerging field of soft robots has shown great potential to outperform their rigid counterparts due to their soft and safe nature and their capability to perform complex and compliant motions. Many are built, but the designs are conservative and limited to regular shapes. The widely used fabrication method contains bulky pumps, tethered tubings, and silicone [...]
What can this robot do? Learning Capability Models from Appearance and Experiments
As autonomous robots become increasingly multifunctional and adaptive, it becomes difficult to determine the extent of their capabilities, i.e. the tasks they can perform and their strengths and limitations at these tasks. A robot's appearance can provide cues to its physical as well as cognitive capabilities. We present an algorithm that builds on these cues [...]
Carnegie Mellon University
Robust State Estimation for Micro Aerial Vehicles
Autonomous robots provide excellent tools for information gathering in a wide variety of domains, from environmental management to infrastructure inspection and search and rescue. Micro aerial vehicles, in particular, offer a high degree of mobility that can further their effectiveness in such environments. Deployment of aerial [...]
Deep Reinforcement Learning with skill library: Learning and exploration with temporal abstractions using coarse approximate dynamics models
Reinforcement learning is a computational approach to learning from interaction. However, learning from scratch using reinforcement learning requires an exorbitant number of interactions with the environment, even for simple tasks. One way to alleviate the problem is to reuse previously learned skills, as humans do. This thesis provides frameworks and algorithms to build and reuse [...]
Carnegie Mellon University
Semantic Segmentation for Terrain Roughness Estimation Using Data Autolabeled with a Custom Roughness Metric
Traditional methods for off-road terrain estimation use some type of learning network to predict hand-labeled classes of terrain such as short grass, tall grass, dirt, and trees. Other learning methods, which can give more detailed but still discrete classes, use onboard sensors to measure the terrain roughness and then predict the terrain type. There also exists [...]
Carnegie Mellon University
Automated Design of Manipulators For In-Hand Tasks
Grasp planning and motion synthesis for dexterous manipulation tasks are traditionally done given a pre-existing kinematic model for the robotic hand. In this paper, we introduce a framework for automatically designing hand topologies best suited for manipulation tasks given high level objectives as input. Our goal is to ultimately design a program that is able [...]
Learning Neural Parsers with Deterministic Differentiable Imitation Learning
Abstract: In this work, we explore the problem of learning to decompose spatial tasks into segments, as exemplified by the problem of a painting robot covering a large object. Inspired by the ability of classical decision tree algorithms to construct structured partitions of their input spaces, we formulate the problem of decomposing objects into segments [...]
Carnegie Mellon University
Integrating Structure with Deep Reinforcement and Imitation Learning
Most deep reinforcement and imitation learning methods are data-driven and do not utilize the underlying structure of the problem. While these methods have achieved great success on many challenging tasks, several key problems such as generalization, data efficiency and compositionality remain open. Utilizing problem structure in the form of architecture design, priors, domain knowledge etc. may [...]
Carnegie Mellon University
Learning Reactive Flight Control Policies: from LIDAR measurements to Actions
Abstract: The end goal of a reactive flight control pipeline is to output control commands based on local sensor inputs. Classical state estimation and control algorithms break down this problem by first estimating the robot’s velocity and then computing a roll and pitch command based on that velocity. However, this approach is not robust in [...]
Carnegie Mellon University
Transparency in Deep Reinforcement Learning Networks
In recent years there has been growing interest in explainability for machine learning models in general and deep learning in particular. This is because deep learning-based approaches have made tremendous progress in computer vision, reinforcement learning, and language-related domains, and are being increasingly used in application areas [...]
Carnegie Mellon University
Geometric approaches to motion planning for two classes of low-Reynolds number swimmers
Microrobots have the potential to impact many areas of medicine such as microsurgery, targeted drug delivery and minimally invasive sensing. Just like microorganisms themselves, microrobots developed for these applications need to swim in a low-Reynolds number regime which warrants locomotive strategies that differ from their macroscopic counterparts. To this end, Purcell’s three-link planar swimmer has [...]
Carnegie Mellon University
Autonomous 3D Reconstruction in Underwater Unstructured Scenes
Abstract: Reconstruction of marine structures such as pilings underneath piers presents a plethora of interesting challenges. It is one of those tasks better suited to a robot due to harsh underwater environments. Underwater reconstruction typically involves human operators remotely controlling the robot to predetermined way-points based on some prior knowledge of the location and model [...]
Carnegie Mellon University
Wire Detection, Reconstruction, and Avoidance for Unmanned Aerial Vehicles
Abstract: Thin objects, such as wires and power lines, are among the most challenging obstacles for UAVs to detect and avoid, and are a cause of numerous accidents each year. This thesis makes contributions in three areas of this domain: wire segmentation, reconstruction, and avoidance. Pixelwise wire detection can be framed as a binary [...]
The Art of Robotics: Toward a Holistic Approach
I arrived at the Robotics Institute two years ago looking for a good project, something tangible and preferably related to legged locomotion. Instead, I met Matt Mason and started to think about the big picture, ask the big questions. What is manipulation? What is robotics? What makes robotics particularly hard? To answer these questions, I [...]
Carnegie Mellon University
Mapping gamma sources and their flux fields using non-directional flux measurements
There is a compelling need to determine the location and activity of radiation sources from the flux that they generate. There is also a need to create dense flux maps from sparse measurements. This research solves these dual problems. An example of a situation where these capabilities would be vital is at the location of [...]
Carnegie Mellon University
Automated Design of Special Purpose Dexterous Manipulators
Grasp planning and motion synthesis for dexterous manipulation tasks are traditionally done given a pre-existing kinematic model for the robotic hand. In this thesis, we introduce a framework for automatically designing hand topologies best suited for manipulation tasks given high level objectives as input. Our goal is to ultimately design a program that is able [...]
Carnegie Mellon University
Toward Invariant Visual Inertial State Estimation using Information Sparsification
Abstract: In this work, we address two current challenges in real-time visual-inertial odometry (VIO) systems: efficiency and accuracy. To this end, we present a novel approach to tightly couple visual and inertial measurements in a fixed-lag VIO framework using information sparsification. To bound computational complexity, fixed-lag smoothers perform marginalization of variables but consequently deteriorate accuracy and [...]
Generative Point Cloud Modeling with Gaussian Mixture Models for Multi-Robot Exploration
Autonomous exploration in rich 3D environments requires the construction and maintenance of a representation derived from accumulated 3D observations. Volumetric models, which are commonly employed to enable joint reasoning about occupied and free space, scale poorly with the size of the environment. Techniques employed to mitigate this scaling include hierarchical discretization, learning local data summarizations [...]
Carnegie Mellon University
Integrating Model-based Planning with Skill learning for Mobile Manipulation
With an ever-growing demand to automate different day-to-day activities, the task of autonomous manipulation using articulated robots has gained serious traction lately. In this regard, motion planning for manipulation is one of the most highly researched topics. Motion planning for manipulation is often cast as either a model-based planning problem or a machine learning problem. However, both of these [...]
Carnegie Mellon University
In-Field Robotic Leaf Grasping and Automated Crop Spectroscopy
Agricultural robotics is a growing field of intelligent automation that is proving to drastically increase the speed and reliability of in-field tasks such as precision seed planting, harvesting, field mapping, and crop monitoring. More specifically, plant breeders are beginning to use robotic systems to record the physical traits of crops throughout the growing season at [...]
Carnegie Mellon University
Soft-matter Artificial Muscle by Electrochemical Surface Oxidation of Liquid Metal
Natural muscles, a result of more than 500 million years of evolution, are elegant machines that generate force and motion electrochemically. The brief history of robotics does not have the luxury of millions of years to reverse-engineer many aspects of life. The development of artificial muscles therefore seeks to build more muscle-like actuators for robots. [...]
Radiation Source Localization using a Gamma-ray Camera
Radiation source localization is a common and critical task across applications such as nuclear facility decommissioning, radioactive disaster response, and security. Traditional count-based sensors (e.g. Geiger counters) infer range to the source based on the observed number of gamma photons, expected source strength, and assumed intermediate attenuation from the environment. In cluttered 3D settings, such [...]
Carnegie Mellon University
Improving Imitation Learning through Efficient Expert Querying
Learning from demonstration is an intuitive approach to encoding complex behaviors in autonomous agents. Learners have shown success in challenging tasks like autonomous driving, aerial obstacle avoidance, and information gathering, through observation and mimicry alone. State of the art algorithms like Dataset Aggregation (DAgger) have made significant advances over traditional behavior cloning, demonstrating strong theoretical [...]
Failure Is an Option: How the Severity of Robot Errors Affects Human-Robot Interactions
Abstract: Just as humans are imperfect, even the best of robots will eventually fail at performing a task. The likelihood of failure increases as robots expand their roles in our lives. Although task failure is a common problem in robotics and human-robot interaction (HRI), there has been little research investigating human tolerance to said failures, [...]
Carnegie Mellon University
MSR Thesis Talk: Avi Rudich
Title: Kinematic Analysis of 3D Printed Flexible Delta Robots Abstract: Flexible Delta robots show significant promise for use in a wide array of manipulation tasks. They are simple to design and manufacture, and they maintain a high level of repeatability and precision in open loop control. This thesis analyzes the kinematic properties of flexible [...]
Learning Parameter-Efficient Quadrotor Dynamics Models
Abstract: Operation of quadrotors through high-speed, high-acceleration maneuvers remains a challenging problem due to the complex aerodynamics in this regime. While standard physical models suffice for control in near-hover conditions, the primary challenge in executing aggressive trajectories is obtaining a model for the quadrotor dynamics that adequately models the aerodynamic effects present, including lift, drag, [...]
Human-in-the-loop Model Creation
Abstract: Deep generative models make visual content creation more accessible to novice users by automating the synthesis of diverse, realistic content based on a collected dataset. However, the current machine learning approaches miss several elements of the creative process -- the ability to synthesize things that go far beyond the data distribution and everyday experience, [...]
Learning Models and Cost Functions from Unlabeled Data for Off-Road Driving
Abstract: Off-road driving is an important instance of navigation in unstructured environments, which is a key robotics problem with many applications, such as exploration, agriculture, disaster response and defense. The key challenge in off-road driving is to be able to take in high dimensional, multi-modal sensing data and use it to make intelligent decisions on [...]
MSR Thesis Talk: Chonghyuk Song
Title: Total-Recon: Deformable Scene Reconstruction for Embodied View Synthesis Abstract: We explore the task of embodied view synthesis from monocular videos of deformable scenes. Given a minute-long RGBD video of people interacting with their pets, we render the scene from novel camera trajectories derived from in-scene motion of actors: (1) egocentric cameras that simulate the point [...]
MSR Thesis Talk: Shivam Duggal
Title: Learning Single Image 3D Reconstruction from Single-View Image Collections Abstract: We present a framework for learning 3D object shapes and dense cross-object 3D correspondences from just an unaligned category-specific image collection. The 3D shapes are generated implicitly as deformations to a category-specific signed distance field and are learned in an unsupervised manner solely from unaligned [...]
MSR Thesis Talk: Himangi Mittal
Title: Audio-Visual State-Aware Representation Learning from Interaction-Rich Data Abstract: In robotics and augmented reality, the input to the agent is a long stream of video from the first-person or egocentric point of view. Recently, there have been significant efforts to capture humans from their first-person/egocentric view interacting with their own environment as they go about [...]
MSR Thesis Talk: Ken Liu
Title: On Privacy and Personalization in Federated Learning: Analyses and Applications Abstract: Recent advances in machine learning often rely on large and centralized datasets. However, curating such data can be challenging when they hold private information, and policies/regulations may mandate that they remain distributed across data silos (e.g. mobile devices or hospitals). Federated learning (FL) [...]
Carnegie Mellon University
MSR Thesis Talk: Haolun Zhang
Title: Seeing in 3D: Towards Generalizable 3D Visual Representations for Robotic Manipulation Abstract: Despite the recent progress in computer vision and deep learning, robot perception remains a tremendous challenge due to the variations of the objects and the scenes in manipulation tasks. Ideally, a robot trying to manipulate a new object should be able to [...]
MSR Thesis Talk: Muyang Li
Title: Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models Abstract: During image editing, existing deep generative models tend to re-synthesize the entire output from scratch, including the unedited regions. This leads to a significant waste of computation, especially for minor editing operations. In this work, we present Spatially Sparse Inference (SSI), a general-purpose technique [...]
MSR Thesis Talk: Rohan Zeng
Title: Spectral Unmixing and Mapping of Coral Reef Benthic Cover Abstract: Coral reefs are important to the global ecosystem and the local communities and wildlife that rely on the habitat they create. However, coral reefs are also in critical and rapid decline: reefs have degraded over recent decades and what remains is at increasing risk [...]
MSR Thesis Talk: Ashwin Misra
Title: Learn2Plan: Learning variable ordering heuristics for scalable task planning Abstract: Traditional approaches to planning attempt to transform a system into a goal state by applying specific actions in a specific order. In these methods, there is an exponential search space due to considering many possible actions at every decision point. Hierarchical Task Networks use incremental [...]
MSR Thesis Talk: Andrew Jong
Title: Robot Information Gathering for Dynamic Systems in Wildfire Scenarios Abstract: The monitoring of complex dynamic systems, such as those encountered in disaster response, search and rescue, wildlife conservation, and environmental monitoring, presents the fundamental challenge of how to track efficiently with limited resources and partial observability. This thesis presents algorithms and techniques for robotic [...]
MSR Thesis Talk: Erin Wong
Title: Edge Detection by Centimeter Scale Low-Cost Mobile Robots Abstract: In Search and Rescue (SaR) efforts after natural disasters like earthquakes, the primary focus is to find and rescue people in building rubble. These rescue efforts could put first responders at risk and are slow due to the unstable nature of the environment. Robotic solutions [...]
MSR Thesis Talk: Sarvesh Patil
Title: Soft Delta Robots for Dexterous Manipulation Abstract: Dexterous manipulation capabilities of end-effectors afford us a wide range of strategies for fine-grained manipulation tasks. Recent utilization of readily available materials like soft filaments and silicone elastomers has enabled the development of low-cost mechanically intelligent robotic manipulators. This is important for democratizing robot manipulation and increasing [...]
MSR Thesis Talk: Fan Yang
Title: Exploring Safe Reinforcement Learning for Sequential Decision Making Abstract: Safe Reinforcement Learning (RL) focuses on the problem of training a policy to maximize the reward while ensuring safety. It is an important step towards applying RL to safety-critical real-world applications. However, safe RL is challenging due to the trade-off between the two objectives [...]