3D Recognition with self-supervised learning and generic architectures
Abstract: Supervised learning relies on manual labeling which scales poorly with the number of tasks and data. Manual labeling is especially cumbersome for 3D recognition tasks such as detection and segmentation and thus most 3D datasets are surprisingly small compared to image or video datasets. 3D recognition methods are also fragmented based on the type [...]
Carnegie Mellon University
Heuristics for routing and scheduling of Spatio-temporal type problems in industrial environments
Abstract: Spatio-temporal problems are fairly common in industrial environments. In practice, these problems come with different characteristics and are often very hard to solve optimally. So, practitioners prefer to develop heuristics that exploit mathematical structure specific to the problem for obtaining good performance. In this thesis, we will present work on heuristics for 3 different [...]
Computational Light Transport with Interferometry
Abstract: Optical interferometry is the measurement of small, sub-wavelength distances by exploiting the wave nature of light. Due to its capability to resolve micron-scale displacements, it has found widespread applications in biomedical imaging, industrial fabrication, physics, and astrophysics. In this thesis, we introduce a set of techniques, which we call computational interferometry, that bring the benefits [...]
Rapid Adaptation for Robot Learning
Abstract: How can we train a robot to generalize to diverse environments? This question underscores the holy grail of robot learning research because it is difficult to supervise an agent for all possible situations it can encounter in the future. We posit that the only way to guarantee such a generalization is to continually learn and [...]
Carnegie Mellon University
3D Reconstruction using Differential Imaging
Abstract: 3D reconstruction has been at the core of many computer vision applications, including autonomous driving, visual inspection in manufacturing, and augmented and virtual reality (AR/VR). Despite the tremendous progress made over the years, there remain challenging open-research problems. This thesis addresses three such problems in 3D reconstruction. First, we address the problem of defocus [...]
Robotic Cave Exploration for Search, Science, and Survey
Abstract: Robotic cave exploration has the potential to create significant societal impact through facilitating search and rescue, in the fight against antibiotic resistance (science), and via mapping (survey). But many state-of-the-art approaches for active perception and autonomy in subterranean environments rely on disparate perceptual pipelines (e.g., pose estimation, occupancy modeling, hazard detection) that process the same underlying sensor data in [...]
Humans, hands, and horses: 3D reconstruction of articulated object categories using strong, weak, and self-supervision
Abstract: Reconstructing 3D objects from a single 2D image is a task that humans perform effortlessly, yet computer vision so far has only robustly solved 3D face reconstruction. In this talk we will see how we can extend the scope of monocular 3D reconstruction to more challenging, articulated categories such as human bodies, hands and [...]
Enabling Grounded Language Communication for Human-Robot Teaming
Abstract: The ability of robots to effectively understand natural language instructions and convey information about their observations and interactions with the physical world is highly dependent on the sophistication and fidelity of the robot’s representations of language, environment, and actions. As we progress towards more intelligent systems that perform a wider range of tasks in a [...]
Looking behind the Seen in Order to Anticipate
Abstract: Despite significant recent progress in computer vision and machine learning, personalized autonomous agents often still don’t participate robustly and safely across tasks in our environment. We think this is largely because they lack an ability to anticipate, which in turn is due to a missing understanding about what is happening behind the seen, i.e., [...]
Robots that Learn through Language
Abstract: Advances in perception have been integral to transitioning robots from machines restricted to factory automation to autonomous agents that operate robustly in unstructured environments. As our surrogates, robots enable people to explore the deepest depths of the ocean and distant regions of space, making discoveries that would otherwise be impossible. The age of robots [...]
Towards Reconstructing Any Object in 3D
Abstract: The world we live in is incredibly diverse, comprising over 10k natural and man-made object categories. While the computer vision community has made impressive progress in classifying images from such diverse categories, the state-of-the-art 3D prediction systems are still limited to merely tens of object classes. A key reason for this stark difference [...]
Carnegie Mellon University
Beyond rigid objects: Data-driven Methods for Manipulation of Deformable Objects
Abstract: Manipulation of deformable objects challenges common assumptions made for rigid objects. Deformable objects have high intrinsic state representation and complex dynamics with high degrees of freedom, making state estimation and planning difficult. The completed work can be divided into two parts. In the first part, we explore reinforcement learning (RL) as a [...]
Carnegie Mellon University
Simulation, Perception, and Generation of Human Behavior
Abstract: Understanding and modeling human behavior is fundamental to almost any computer vision or robotics application that involves humans. In this thesis, we take a holistic approach to human behavior modeling and tackle its three essential aspects: simulation, perception, and generation. Throughout this thesis, we show how the three aspects are deeply connected and [...]
The Clinician’s AI Partner: Augmenting Clinician Capabilities Across the Spectrum of Healthcare
Abstract: Clinicians often work under highly demanding conditions to deliver complex care to patients. As our aging population grows and care becomes increasingly complex, physicians and nurses are now also experiencing feelings of burnout at unprecedented levels. In this talk, I will discuss possibilities for computer vision to function as a partner to clinicians, and to augment their capabilities, across [...]
The Unusual Effectiveness of Abstractions for Assistive AI
Abstract: Can we balance efficiency and reliability while designing assistive AI systems? What would such AI systems need to provide? In this talk I will present some of our recent work addressing these questions. In particular, I will show that a few fundamental principles of abstraction are surprisingly effective in designing efficient and reliable AI [...]
Reliable and Accessible Visual Recognition
Abstract: As visual recognition models are developed across diverse applications, we need the ability to reliably deploy our systems in a variety of environments. At the same time, visual models tend to be trained and evaluated on a static set of curated and annotated data which only represents a subset of the world. In this [...]
Fake It Till You Make It: Face analysis in the wild using synthetic data alone
Abstract: In this seminar I will demonstrate how synthetic data alone can be used to perform face-related computer vision in the wild. The community has long enjoyed the benefits of synthesizing training data with graphics, but the domain gap between real and synthetic data has remained a problem, especially for human faces. Researchers have tried [...]
Carnegie Mellon University
Structured Learning for Robust Robot Manipulation
Abstract: Robust and generalizable robots that can autonomously manipulate objects in semi-structured environments can bring material benefits to society. Data-driven learning approaches are crucial for enabling such systems by identifying and exploiting patterns in semi-structured environments, allowing robots to adapt to novel scenarios with minimal human supervision. However, despite significant prior work in learning for [...]
Robotics and Warehouse Automation at Berkshire Grey
Abstract: This talk tells the Berkshire Grey story, from its founding in 2013 to its IPO earlier this year, the first robotics IPO since iRobot over 15 years ago. Berkshire Grey produces automated systems for e-commerce order fulfillment, parcel sortation, store replenishment, and related operations in warehouses, distribution centers, and in the back ends of [...]
An Experimental Design Perspective on Model-Based Reinforcement Learning
Abstract: In many practical applications of RL, it is expensive to observe state transitions from the environment. For example, in the problem of plasma control for nuclear fusion, computing the next state for a given state-action pair requires querying an expensive transition function which can lead to many hours of computer simulation or dollars of [...]
Learning Model Preconditions for Planning with Multiple Models
Abstract: Different models can provide differing levels of fidelity when a robot is planning. Analytical models are often fast to evaluate but only work in limited ranges of conditions. Meanwhile, physics simulators are effective at modeling complex interactions between objects but are typically more computationally expensive. Learning when to switch between the various models can [...]
Leveraging StyleGAN for Image Editing and Manipulation
Abstract: StyleGAN has recently been established as the state-of-the-art unconditional generator, synthesizing images of phenomenal realism and fidelity, particularly for human faces. With its rich semantic space, many works have attempted to understand and control StyleGAN’s latent representations with the goal of performing image manipulations. To perform manipulations on real images, however, one must learn to [...]
Resilient Exploration in SubT Environments: Team Explorer’s Approach and Lessons Learned in the Final Event
Abstract: Subterranean robot exploration is difficult with many mobility, communications, and navigation challenges that require an approach with a diverse set of systems, and reliable autonomy. While prior work has demonstrated partial successes in addressing the problem, here we convey a comprehensive approach to address the problem of subterranean exploration in a wide range of [...]
Simulation-based Planning for Pick-and-Place in Heavy Clutter using Non-prehensile Manipulation
Abstract: Robot manipulation in domestic households, industrial manufacturing and warehouses might require contact-rich interactions with objects in the environment. For pick-and-place style grasping tasks in cluttered scenes, it can be more economical for the robot to rely on non-prehensile actions vis-à-vis deliberate prehensile rearrangement. Non-prehensile actions also let the robot manipulate large and bulky objects [...]
Carnegie Mellon University
Relationships in instance segmentation and anomaly detection
Abstract: This thesis primarily covers work on two different tasks in computer vision: (1) anomaly detection and (2) instance segmentation. Anomaly detection is an underexplored unsupervised problem that has existed in many fields. On the other hand, instance (and panoptic) segmentation is a supervised problem that can leverage the powerful data and key developments from [...]
Next-Gen Video Communication
Abstract: Video communication connects our world. It is necessary in conducting business, educational and personal activities across different geographical locations. However, the quality of an average user’s video communication is dramatically worse than that of professionally created videos in news broadcasts, talk shows, and on YouTube. This is because professionally created videos are often captured with [...]
Carnegie Mellon University
Learning with Diverse Forms of Imperfect and Indirect Supervision
Abstract: High capacity Machine Learning (ML) models trained on large, annotated datasets have driven impressive advances in several fields including natural language processing and computer vision, in turn leading to impactful applications of ML in areas such as healthcare, e-commerce, and predictive maintenance. However, obtaining annotated datasets at the scale required for training such models [...]
MRSD Annual Poster Presentation
Four student teams from the MRSD program will use posters, videos, and hardware to show their project work on robots for room disinfection, search & rescue, increasing human capability via a third arm, and increased-efficiency factory-floor obstacle avoidance.
Carnegie Mellon University
3D Representation Learning for Perception and Prediction: A Modular Yet Highly Integrated Approach
Abstract: Modularized and cascaded autonomy stacks (object detection, then tracking, and then trajectory prediction) have been widely adopted in many autonomous systems such as self-driving cars due to their interpretability. In this talk, I advocate the use of such a modular approach but improve its accuracy and robustness by developing different 3D representations for each [...]
Carnegie Mellon University
MSR Thesis Talk: Avi Rudich
Title: Kinematic Analysis of 3D Printed Flexible Delta Robots
Abstract: Flexible Delta robots show significant promise for use in a wide array of manipulation tasks. They are simple to design and manufacture, and they maintain a high level of repeatability and precision in open loop control. This thesis analyzes the kinematic properties of flexible [...]
Reconstructing common objects to interact with
Abstract: We humans are able to understand the 3D shapes of common everyday objects from a wide range of categories and to interact with them. We understand that cups are usually cylinder-like, and we can easily predict the shape of one particular cup, whether in isolation or when it is held by a human. We aim to [...]
Activity Understanding of Scripted Performances
Abstract: The PSU Taichi for Smart Health project has been doing a deep-dive into vision-based analysis of 24-form Yang-style Taichi (TaijiQuan). A key property of Taichi, shared by martial arts katas and prearranged form exercises in other sports, is practice of a scripted routine to build both mental and physical competence. The scripted nature of routines [...]
Carnegie Mellon University
Dynamical Model Learning and Inversion for Aggressive Quadrotor Flight
Abstract: Quadrotor applications have seen a surge recently and many tasks require precise and accurate controls. Flying fast is critical in many applications and the limited onboard power source makes completing tasks quickly even more important. Staying on a desired course while traveling at high speeds and high accelerations is difficult due to complex and [...]
Carnegie Mellon University
Person Transfers Between Multiple Service Robots
Abstract: As more service robots are deployed in the world, human-robot interaction will not be limited to one-to-one interactions between users and robots. Instead, users will likely have to interact with multiple robots, simultaneously or sequentially, throughout their day to receive services and complete different tasks. In this thesis, I describe work in which my [...]
A causal framework to diagnose and fix issues with doors
Abstract: Many animals, such as ravens (and a fortiori humans), exhibit a great deal of physical intelligence that allows them to solve complex multi-step physical puzzles. This ability indicates an understanding or a faculty to represent causality and mechanisms, understand when something goes wrong, and figure out how to deal with it. As a step [...]
Carnegie Mellon University
Understanding Unbalanced Datasets Through Simple Models and Dataset Exploration
Abstract: Computer vision models have proven to be tremendously capable of recognizing and detecting several classes and objects. They succeed in classes widely ranging in type and scale from humans to cans to pens. However, the best performing classes have abundant examples in large-scale datasets today. In unbalanced datasets, where some categories are seen in [...]
Domain adaptive object detection
Abstract: Recent advances in deep learning have led to the development of accurate and efficient models for object detection. However, learning highly accurate models relies on the availability of large-scale annotated datasets. Due to this, model performance drops drastically when evaluated on label-scarce datasets having visually distinct images. Domain adaptation tries to mitigate this degradation. In [...]
Carnegie Mellon University
Understanding, Exploiting and Improving Inter-view Relationships
Abstract: Multi-view machine learning has garnered substantial attention in various applications over recent years. Many such applications involve learning on data obtained from multiple heterogeneous sources of information, for example, in multi-sensor systems such as self-driving cars, or in monitoring the vital signs of intensive care patients at their bedside. Learning models for such applications can often benefit [...]
Model-Centric Verification of Artificial Intelligence
Abstract: This work shows how provable guarantees can be used to supplement probabilistic estimates in the context of Artificial Intelligence (AI) systems. Statistical techniques measure the expected performance of a model, but low error rates say nothing about the ways in which errors manifest. Formal verification of model adherence to design specifications can yield certificates [...]
Designing Whisker Sensors to Detect Multiple Mechanical Stimuli for Robotic Applications
Abstract: Many mammals, such as rats and seals, use their whiskers as versatile mechanical sensors to gain precise information about their surroundings. Whisker-inspired sensors on robotic platforms have shown their potential benefit, improving applications ranging from drone navigation to texture mapping. Despite this, there is a gap between the engineered sensors and many of the [...]
Carnegie Mellon University
Human-in-the-loop Control of Mobile Robots
Abstract: Human-in-the-loop control for mobile robots is an important aspect of robot operation, especially for navigation in unstructured environments or in the case of unexpected events. However, traditional paradigms of human-in-the-loop control have relied heavily on the human to provide precise and accurate control inputs to the robot, or reduced the role of the human [...]
Visual Understanding across Semantic Groups, Domains and Devices
Abstract: Deep neural networks often lack generalization capabilities to accommodate changes in the input/output domain distributions and, therefore, are inherently limited by the restricted visual and semantic information contained in the original training set. In this talk, we argue the importance of the versatility of deep neural architectures and we explore it from various perspectives. [...]
Towards Robust Human-Robot Interaction: A Quality Diversity Approach
Abstract: The growth of scale and complexity of interactions between humans and robots highlights the need for new computational methods to automatically evaluate novel algorithms and applications. Exploring the diverse scenarios of interaction between humans and robots in simulation can improve understanding of complex human-robot interaction systems and avoid potentially costly failures in real-world settings. [...]