Student Talks
Carnegie Mellon University
MSR Thesis Talk – Zhaoyuan Fang
Title: Features in Extra Dimensions: Spatial and Temporal Scene Representations Abstract: Computer vision models have made great progress in featurizing pixels of images. However, an image is only a projection of the actual 3D scene: occlusions and perspective distortions exist. To arrive at a better representation of the scene itself, extra dimensions are needed to [...]
Carnegie Mellon University
MSR Thesis Talk – Yunchu Zhang
Title: Library of behaviors and tools for robot manipulation Abstract: Learned policies often fail to generalize across environment variations, such as different objects, object arrangements, or camera viewpoints. Moreover, most policies are trained and tested in simulation environments, and the sim2real gap remains large under weak visual representations that do not disentangle the scene from [...]
Carnegie Mellon University
Learning Structured World Model for Deformable Object Manipulation
Abstract: Manipulation of deformable objects challenges common assumptions in robotic manipulation, such as low-dimension state representation, known dynamics, and minimal occlusion. Deformable objects have high intrinsic state representation, complex dynamics with high degrees of freedom, and severe self-occlusion. These properties make them difficult for state estimation and planning. In this thesis, we introduce benchmarks and [...]
Safe control under input limits with neural CBF
Abstract: In theory, control barrier functions (CBFs) provide a convenient means to construct provably safe controllers. However, a typical problem is that the constructed controller will exceed input limits, and merely clipping the inputs will break all safety guarantees. To address this practical flaw, we consider synthesizing a CBF that will respect input limits. We [...]
Carnegie Mellon University
MSR Thesis Talk – Chi Yen Lee
Title: Enhancing Quadruped Locomotion Stability with Reaction Wheel Systems and Model Predictive Control Zoom: https://cmu.zoom.us/j/96808397411?pwd=YnFDaFk1WVVyZjc5UndlOTBZL0tjUT09 Abstract: The development of quadruped robots offers a mobility solution that allows robot agents to navigate complicated terrains, making them extremely versatile robots in a variety of environments. Today, there are a number of research challenges facing quadruped development. First, the [...]
MSR Thesis Talk – Chu Er Pan
Title: 6D Object Pose Estimation for Manipulation via Weak Supervision Abstract: 6D object pose estimation is essential for robotic manipulation tasks. Existing learning-based pose estimators often rely on training from labeled absolute poses with fixed object canonical frames, which (1) requires datasets with annotations of object absolute pose that are resource-intensive to collect; (2) is hard [...]
Carnegie Mellon University
MSR Thesis Talk: Ruoyang Xu
Title: Using 3D Imaging Radar for Indoor Localization and Mapping Zoom: https://cmu.zoom.us/j/95090884062?pwd=dVZDVHJDTGVUWW9iSlJLTWtidThBUT09 Meeting ID: 950 9088 4062 Passcode: 411959 Abstract: 3D Imaging Radars offer robust perception capability through visually demanding environments due to the unique penetrative and reflective properties of millimeter waves. However, the utilization of Imaging radar for robot navigation and mapping remains under-explored due [...]
Carnegie Mellon University
MSR Thesis Talk: Gaurav Pathak
Title: Programmable light curtains for Safety Envelopes, SLAM and Navigation Abstract: Conventional robot perception and navigation pipelines are built using traditional sensors such as RGB cameras, stereo depth sensors and LiDARs. These sensors scan the entire scene in a fixed and uniform way. In contrast, programmable light curtains are a recently-invented, resource-efficient sensor that measures the [...]
Carnegie Mellon University
MSR Thesis Talk: Andrew VanOsten
Title: Lidar-Visual-Inertial Odometry via Modifications and Improvements to Super Odometry Abstract: The main focus of this thesis involves improvements and extensions to Super Odometry, a preexisting method for lidar-inertial odometry. This was done in the context of the DARPA RACER program as a member of Carnegie Mellon's DEAD Fast team, aiming to provide reliable [...]
Carnegie Mellon University
MSR Thesis Talk – Bassam Bikdash
Title: Boundary-Aware Demons Algorithm with Applications in Electronic Waste Recycling Abstract: Electronic waste (e-waste) refers to electronic devices that are nearing the end of their useful life and are discarded, donated, or given away. Valuable metallic and plastic components in e-waste (gold, silver, platinum) are estimated to be worth upwards of $60 billion, and although e-waste represents [...]
Carnegie Mellon University
MSR Thesis Talk – Mary Hatfalvi
Title: Introspective Perception through Identifying Blur, Light Direction, and Angle-of-View Abstract: Robotic perception tasks have achieved great performance, especially in autonomous vehicles and robot assistance. However, we still often do not understand how and when perception tasks fail. Researchers have achieved some success in creating introspective perception systems that detect when perception tasks will fail, [...]
Carnegie Mellon University
MSR Thesis Talk: Bowei Chen
Title: Image Synthesis with Appearance Decomposition Abstract: Our visual world is compositional and its appearance can be decomposed into various components. Leveraging these components can be beneficial for challenging image synthesis tasks. To this end, this thesis focuses on studying how appearance decomposition can improve image synthesis methods using two examples. (1) Structural decomposition: we introduce [...]
MSR Thesis Talk: Ivan Cisneros
Title: A VPR-Based Technique for UAV Localization In Unseen Environments Abstract: Unmanned Aerial Vehicles (UAVs) primarily rely on GPS-assisted localization and navigation due to the accessibility and ubiquity of such systems. However, this presents a potentially catastrophic single point of failure that may prevent autonomous UAVs from becoming truly reliable, as GPS is prone to dropout, [...]
MSR Thesis Talk: Yehonathan Litman
Title: GPS-Denied Global Visual-Inertial Ground Vehicle State Estimation via Image Registration Abstract: Robotic systems such as unmanned ground vehicles (UGVs) often depend on GPS for navigation in outdoor environments. In GPS-denied environments, one approach to maintain a global state estimate is localizing based on preexisting georeferenced aerial or satellite imagery. However, this is inherently challenged [...]
Carnegie Mellon University
MSR Thesis Talk – George Cazenavette
Title: Learning to Distill Datasets by Matching Expert Training Trajectories Project Page: https://georgecazenavette.github.io/mtt-distillation/ Abstract: Dataset distillation is the task of synthesizing a small dataset such that a model trained on the synthetic set will match the test accuracy of the model trained on the full dataset. In this talk, we review several of our recent [...]
Carnegie Mellon University
MSR Thesis Talk – Zongyue Zhao
Title: Coordinating Heterogeneous Teams for Urban Search and Rescue Abstract: The mission of Urban Search and Rescue (USAR) has drawn significant interest in robotics. Autonomous entities must be able to share knowledge efficiently to address visibility and collaboration challenges in a complex environment shortly after structural collapse catastrophes. In this thesis, we present methods to coordinate [...]
Carnegie Mellon University
MSR Thesis Talk – Jeff Hu
Title: Composition Learning in “Modular” Robot Systems Abstract: Modular robot and multi-robot systems share a concept in common: composition, i.e. the study of how parts can be combined so they can be used to achieve certain objectives. Our vision is to enable robotic systems to configure and reconfigure themselves during field deployment, either autonomously or [...]
Carnegie Mellon University
MSR Thesis Talk – Tom Bu
Title: Towards HD Map Updates With Crosswalk Change Detection From Vehicle-mounted Cameras Zoom: https://cmu.zoom.us/j/4452379705 Abstract: Many autonomous vehicles rely on high-definition maps that contain road layout and road semantics as priors for perception, planning and prediction. However, these maps can become stale over time as the road environment changes. This thesis develops a road monitoring framework [...]
MSR Thesis Talk: Zilin Si
Title: Taxim: An Example-based Simulation Model for GelSight Tactile Sensors and its Sim-to-Real Applications Location: NSH 4305 or Zoom https://cmu.zoom.us/j/91769761787?pwd=cGZ2RElKMVJaQ1NVNG5BdFQ0Ny9uQT09 Abstract: Simulation is widely used in robotics for system verification and large-scale data collection. However, simulating a robot system efficiently and with high fidelity, from sensing, perception to manipulation, has been a long-standing challenge. Tactile sensing, as [...]
Carnegie Mellon University
MSR Thesis Talk – Benjamin Jensen
Title: A Low-Cost Attitude Determination and Control System and Hardware-in-the-Loop Testbed for CubeSats Zoom: https://cmu.zoom.us/j/92654622790?pwd=d0pYcTJ4K0xzdmYvUHFYWC9lMDBhQT09 Abstract: Since their initial development in the late 1990s, CubeSats have quickly grown popular due to their relatively low cost and short development period. However, CubeSat launches are prone to failure, with less than half of CubeSats completely fulfilling their [...]
Carnegie Mellon University
MSR Thesis Talk – Swapnil Pande
Title: Driving by Dreaming: Offline Model-Based Reinforcement Learning for Motion Planning for Autonomous Vehicles Abstract: While there has been significant progress in deploying autonomous vehicles (AVs) in urban driving settings, there remains a long tail of challenging motion planning scenarios that must be addressed before truly driverless operation is possible. The current paradigm for motion planner [...]
Carnegie Mellon University
MSR Thesis Talk – Alvin Shek
Title: Learning from Physical Human Feedback: An Object-Centric One-Shot Adaptation Method Abstract: For robots to be effectively deployed in novel environments and tasks, they must be able to understand the feedback expressed by humans during intervention. This can either correct undesirable behavior or indicate additional preferences. Existing methods either require repeated episodes of interactions or [...]
Carnegie Mellon University
MSR Thesis Talk: Jiaqi Geng
Title: Dense Human Pose Estimation From WiFi Abstract: Advances in computer vision and machine learning techniques have led to significant development in 2D and 3D human pose estimation from RGB cameras, LiDAR, and radars. However, human pose estimation from images is adversely affected by occlusion and lighting, which are common in many scenarios of interest. [...]
Carnegie Mellon University
MSR Thesis Talk: Jianchun Chen
Title: An efficient approach for sequential shape human performance capture from monocular video Abstract: Human performance capture from RGB videos in unconstrained environments has become very popular for applications to generate virtual avatars or digital actors. Modern approaches rely on neural network algorithms to estimate geometry directly from images, resulting in a coarse representation of [...]
Thermal Management Considerations For Lunar Polar Micro-Rovers
Meeting ID: 940 0396 4889 Passcode: 906118 Abstract: This research addresses the significant and unprecedented challenge of thermal regulation for lunar polar micro-rovers. These are distinct from priors by way of very small size, mass, and power, but particularly for the extremes of ambient environment in which they must operate. On the lunar poles, rovers experience temperatures [...]
Carnegie Mellon University
MSR Thesis Talk: Zhihao Zhang
Title: Efficient Methods for Model Performance Inference Abstract: A key challenge in neural architecture search (NAS) is quickly inferring the predictive performance of a broad spectrum of neural networks to discover statistically accurate and computationally efficient ones. We refer to this task as model performance inference (MPI). The current practice for efficient MPI is gradient-based methods [...]
Carnegie Mellon University
MSR Thesis Talk: Chufan Gao
Title: Addressing Time-series Signal Quality in Healthcare Data Abstract: Healthcare data time-series signal quality assessment (SQA) plays a vital role in the accuracy and reliability of machine learning algorithms to analyze health metrics. However, these signals are often corrupted with different kinds of noises and artifacts, including Baseline Wander, Muscle Artifacts, Powerline Interference, and Equipment Failure. This [...]
Carnegie Mellon University
Object Pose Estimation without Direct Supervision
Abstract: Currently, robot manipulation is a special purpose tool, restricted to isolated environments with a fixed set of objects. In order to make robot manipulation more general, robots need to be able to perceive and interact with a large number of objects in cluttered scenes. Traditionally, object pose has been used as a representation to [...]
Improving Robotic Exploration with Self-Supervision and Diverse Data
Abstract: Reinforcement learning (RL) holds great promise for improving robotics, as it allows systems to move beyond passive learning and interact with the world while learning from these interactions. A key aspect of this interaction is exploration: which actions should an RL agent take to best learn about the world? Prior work on exploration is typically [...]
An Extension to Model Predictive Path Integral Control and Modeling Considerations for Off-road Autonomous Driving in Complex Environment
Abstract: The ability to traverse complex environments and terrains is critical to autonomously driving off-road in a fast and safe manner. Challenges such as terrain navigation and vehicle rollover prevention become imperative due to the off-road vehicle configuration and the operating environment itself. This talk will introduce some of these challenges and the different tools [...]
Carnegie Mellon University
Heuristic Search Based Planning by Minimizing Anticipated Search Efforts
Abstract: We focus on relatively low dimensional robot motion planning problems, such as planning for navigation of a self-driving vehicle, unmanned aerial vehicles (UAVs), and footstep planning for humanoids. In these problems, there is a need for fast planning, potentially compromising the solution quality. Often, we want to plan fast but are also interested in [...]
Combining Offline Reinforcement Learning with Stochastic Multi-Agent Planning for Autonomous Driving
Abstract: Fully autonomous vehicles have the potential to greatly reduce vehicular accidents and revolutionize how people travel and how we transport goods. Many of the major challenges for autonomous driving systems emerge from the numerous traffic situations that require complex interactions with other agents. For the foreseeable future, autonomous vehicles will have to share the [...]
Human-to-Robot Imitation in the Wild
Abstract: In this talk, I approach the problem of learning by watching humans in the wild. While traditional approaches in Imitation and Reinforcement Learning are promising for learning in the real world, they are either sample inefficient or are constrained to lab settings. Meanwhile, there has been a lot of success in processing passive, unstructured human [...]
Causal Robot Learning for Manipulation
Abstract: Two decades into the third age of AI, the rise of deep learning has yielded two seemingly disparate realities. In one, massive accomplishments have been achieved in deep reinforcement learning, protein folding, and large language models. Yet, in the other, the promises of deep learning to empower robots that operate robustly in real-world environments [...]
Dense Reconstruction of Dynamic Structures from Monocular RGB Videos
Abstract: We study the problem of 3D reconstruction of generic and deformable objects and scenes from casually-taken RGB videos, to create a system for capturing the dynamic 3D world. Being able to reconstruct dynamic structures from casual videos allows one to create avatars and motion references for arbitrary objects without specialized devices, [...]
Differentiable Collision Detection
Abstract: Collision detection between objects is critical for simulation, control, and learning for robotic systems. However, existing collision detection routines are inherently non-differentiable, limiting their applications in gradient-based optimization tools. In this talk, I present DCOL: a fast and fully differentiable collision-detection framework that reasons about collisions between a set of composable and highly expressive [...]
On Interaction, Imitation, and Causation
Abstract: A standard critique of machine learning models (especially neural networks) is that they pick up on spurious correlations rather than causal relationships and are therefore brittle in the face of distribution shift. Solving this problem in full generality is impossible (i.e. there might be no good way to distinguish between the two). However, if [...]
Learning via Visual-Tactile Interaction
Abstract: Humans learn by interacting with their surroundings using all of their senses. The first of these senses to develop is touch, and it is the first way that young humans explore their environment, learn about objects, and tune their cost functions (via pain or treats). Yet, robots are often denied this highly informative and [...]
Carnegie Mellon University
Accelerating Numerical Methods for Optimal Control
Abstract: Many modern control methods, such as model-predictive control, rely heavily on solving optimization problems in real time. In particular, the ability to efficiently solve optimal control problems has enabled many of the recent breakthroughs in achieving highly dynamic behaviors for complex robotic systems. The high computational requirements of these algorithms demand novel algorithms tailor-suited [...]
Tactile SLAM: perception for dexterity via vision-based touch
Abstract: Touch provides a direct window into robot-object interaction, free from occlusion and aliasing faced by visual sensing. Collated tactile perception can facilitate contact-rich tasks such as in-hand manipulation, sliding, and grasping. Here, online estimates of object geometry and pose are crucial for downstream planning and control. With significant advances in tactile sensing, like vision-based touch, a [...]
Resource Allocation for Learning in Robotics
Abstract: Robots operating in the real world need fast and intelligent decision making systems. While these systems have traditionally consisted of human-engineered behaviors and world models, there has been a lot of interest in integrating them with data-driven components to achieve faster execution and reduce hand-engineering. Unfortunately, these learning-based methods require large amounts of training [...]
Planning with Dynamics by Interleaving Search and Trajectory Optimization
Abstract: Search-based planning algorithms enable autonomous agents like robots to come up with well-reasoned long-horizon plans to achieve a given task objective. They do so by searching over the graph that results from discretizing the state and action space. However, in robotics, several dynamically rich tasks require high-dimensional planning in the continuous space. For such [...]
Solving Constraint Tasks with Memory-Based Learning
Abstract: In constraint tasks, the current task state heavily limits what actions are available to an agent. Mechanical constraints exist in many common tasks such as construction, disassembly, and rearrangement and task space constraints exist in an even broader range of tasks. Deep reinforcement learning algorithms have typically struggled with constraint tasks for two main [...]
Head-Worn Assistive Teleoperation of Mobile Manipulators
Abstract: Mobile manipulators in the home can provide increased autonomy to individuals with severe motor impairments, who often cannot complete activities of daily living (ADLs) without the help of a caregiver. Teleoperation of an assistive mobile manipulator could enable an individual with motor impairments to independently perform self-care and household tasks, yet limited motor function [...]
Text Classification with Class Descriptions Only
Abstract: In this work, we introduce KeyClass, a weakly-supervised text classification framework that learns from class-label descriptions only, without the need to use any human-labeled documents. It leverages the linguistic domain knowledge stored within pre-trained language models and data programming to automatically label documents. We demonstrate its efficacy and flexibility by comparing it to state-of-the-art [...]
Multi-Object Tracking in the Crowd
Abstract: In this talk, I will focus on the problem of multi-object tracking in crowded scenes. Tracking within crowds is particularly challenging due to heavy occlusion and frequent crossover between tracking targets. The problem becomes more difficult when we only have noisy bounding boxes due to background and neighboring objects. Existing tracking methods try to [...]
Utilizing Panoptic Segmentation and a Locally-Conditioned Neural Representation to Build Richer 3D Maps
Abstract: Advances in deep-learning based perception and maturation of volumetric RGB-D mapping algorithms have allowed autonomous robots to be deployed in increasingly complex environments. For robust operation in open-world conditions, however, perceptual capabilities are still lacking. Limitations of commodity depth sensors mean that complex geometries and textures cannot be reconstructed accurately. Semantic understanding is still [...]
Magnification-invariant retinal distance estimation using a laser aiming beam
Abstract: Retinal surgery procedures like epiretinal membrane peeling and retinal vein cannulation require surgeons to manipulate very delicate structures in the eye with little room for error. Many robotic surgery systems have been developed to help surgeons and enforce safeguards during these demanding procedures. One essential piece of information that is required to create and [...]
Bridging Humans and Generative Models
Abstract: Deep generative models make visual content creation more accessible to novice and professional users alike by automating the synthesis of diverse, realistic content based on a collected dataset. People often use generative models as data-driven sources, making it challenging to personalize a model easily. Currently, personalizing a model requires careful data curation, which is [...]
Impulse considerations for reasoning about intermittent contacts
Abstract: Many of our interactions with the environment involve making and breaking contacts. However, it is not always obvious how one should reason about these intermittent contacts (sequence, timings, locations) in an online and adaptive way. This is particularly relevant in gait generation for legged locomotion control, where it is standard to simply predefine and [...]
Multi-Human 3D Reconstruction from Monocular RGB Videos
Abstract: We study the problem of multi-human 3D reconstruction from RGB videos captured in the wild. Humans have dynamic motion, and reconstructing them in arbitrary settings is key to building immersive social telepresence, assistive humanoid robots, and augmented reality systems. However, creating such a system requires addressing fundamental issues with previous works regarding the data [...]
Learning and Translating Temporal Abstractions across Humans and Robots
Abstract: Humans possess a remarkable ability to learn to perform tasks from a variety of different sources: language, instructions, demonstration, etc. In each case, they are able to easily extract the high-level strategy to solve the task, such as the recipe of cooking a dish, whilst ignoring irrelevant details, such as the precise shape of [...]
Robust Incremental Smoothing and Mapping
Abstract: In this work we present a method for robust optimization for online incremental Simultaneous Localization and Mapping (SLAM). Due to the NP-Hardness of data association in the presence of perceptual aliasing, tractable (approximate) approaches to data association will produce erroneous measurements. We require SLAM back-ends that can converge to accurate solutions in the presence [...]
Carnegie Mellon University
3D Reconstruction using Differential Imaging
Abstract: 3D reconstruction has been at the core of many computer vision applications, including autonomous driving, visual inspection in manufacturing, and augmented and virtual reality (AR/VR). Because monocular 3D sensing is fundamentally ill-posed, many techniques aiming for accurate reconstruction use multiple captures to solve the inverse problem. Depending on the amount of change in these [...]
Learning with Structured Priors for Robust Robot Manipulation
Abstract: Robust and generalizable robots that can autonomously manipulate objects in semi-structured environments can bring material benefits to society. Data-driven learning approaches are crucial for enabling such systems by identifying and exploiting patterns in semi-structured environments, allowing robots to adapt to novel scenarios with minimal human supervision. However, despite significant prior work in learning for [...]
Learning Parameter-Efficient Quadrotor Dynamics Models
Abstract: Operation of quadrotors through high-speed, high-acceleration maneuvers remains a challenging problem due to the complex aerodynamics in this regime. While standard physical models suffice for control in near-hover conditions, the primary challenge in executing aggressive trajectories is obtaining a model for the quadrotor dynamics that adequately models the aerodynamic effects present, including lift, drag, [...]
Carnegie Mellon University
Self-Supervising Occlusions For Vision
Abstract: Virtually every scene has occlusions. Even a scene with a single object exhibits self-occlusions - a camera can only view one side of an object (left or right, front or back), or part of the object is outside the field of view. More complex occlusions occur when one or more objects block part(s) of [...]
Predicting The Future and Linking the Past: Learning and Constructing Structured Models for Robotic Manipulation
Abstract: Intelligent robotic agents need to reason about the dynamics of their surrounding world, and use such dynamics reasoning to make future predictions for efficient task planning. In addition, it is also desirable for robots to associate past experience in their memories to their current observation, and conduct analogical reasoning to complete tasks at their [...]
Carnegie Mellon University
MSR Thesis Talk: Tushar Kusnur
Title: Search-based Planning for Sensor-based Coverage Abstract: Robots are excellent candidates for the dull, dirty, and dangerous jobs we do not want humans to perform. Today, these include inspection of large areas or structures, post-disaster assessment, and surveillance. Assessing the aftermath of the recent Fern Hollow bridge collapse in Pittsburgh is one such example. Many [...]
Human-in-the-loop Model Creation
Abstract: Deep generative models make visual content creation more accessible to novice users by automating the synthesis of diverse, realistic content based on a collected dataset. However, the current machine learning approaches miss several elements of the creative process -- the ability to synthesize things that go far beyond the data distribution and everyday experience, [...]
Robotic Interestingness via Human-Informed Few-Shot Object Detection
Abstract: Interestingness recognition is crucial for decision making in autonomous exploration for mobile robots. Previous methods proposed an unsupervised online learning approach that can adapt to environments and detect interesting scenes quickly, but lack the ability to adapt to human-informed interesting objects. To solve this problem, we introduce a human-interactive framework, AirInteraction, that can detect [...]
Carnegie Mellon University
MSR Thesis Talk: Nikhil Angad Bakshi
Title: See But Don't Be Seen: Towards Stealthy Active Search in Heterogeneous Multi-Robot Systems Abstract: Robotic solutions for quick disaster response are essential to ensure minimal loss of life, especially when the search area is too dangerous or too vast for human rescuers. We model this problem as an asynchronous multi-agent active-search task where each robot aims [...]
Carnegie Mellon University
MSR Thesis Talk: Yves Georgy Daoud
Title: Spatial Tasking in Human-Robot Collaborative Exploration Abstract: This work develops a methodology for collaborative human-robot exploration that leverages implicit coordination. Most autonomous single- and multi-robot exploration systems require a remote operator to provide explicit guidance to the robot team. Few works consider how to integrate the human partner alongside robots to provide guidance in the [...]
Carnegie Mellon University
MSR Thesis Talk: Ambareesh Revanur
Title: Towards Video-based Physiology Estimation Abstract: RGB-video based human physiology estimation has a wide range of practical applications in telehealth, sports and deep fake detection. Therefore, researchers in the community have collected several video datasets and have advanced new methods over the years. In this dissertation, we study these methods extensively and aim to address the [...]
Carnegie Mellon University
MSR Thesis Talk: Raghavv Goel
Title: Automating Ultrasound Based Vascular Access Abstract: Timely care of trauma patients is important to prevent casualties in resource-limited regions such as the battlefield. In order to treat such trauma using point of care diagnosis, medical practitioners typically use an ultrasound for vascular access or detection of subcutaneous splinters for providing critical care. The problem here is two-fold: [...]
Carnegie Mellon University
MSR Thesis Talk: Mayank Singh
Title: Analogical Networks: Memory-Modulated In-Context 3D Parsing Abstract: Recent advances in the applications of deep neural networks to numerous visual perception tasks have shown excellent performance. However, this generally requires access to a large number of training samples, and hence one persistent challenge is the setting of few-shot learning. In most existing works, a separate parametric neural [...]
Carnegie Mellon University
Learning with Diverse Forms of Imperfect and Indirect Supervision
Abstract: Powerful Machine Learning (ML) models trained on large, annotated datasets have driven impressive advances in fields including natural language processing and computer vision. In turn, such developments have led to impactful applications of ML in areas such as healthcare, e-commerce, and predictive maintenance. However, obtaining annotated datasets at the scale required for training high [...]
Carnegie Mellon University
MSR Thesis Talk: Yutian Lei
Title: ARC: AdveRsarial Calibration between Modalities Abstract: Advances in computer vision and machine learning techniques have led to flourishing success in RGB-input perception tasks, which has also opened unbounded possibilities for non-RGB-input perception tasks, such as object detection from wireless signals, point clouds, and infrared light. However, compared to the matured development pipeline of RGB-input [...]
FRIDA: Supporting Artistic Communication in Real-World Image Synthesis Through Diverse Input Modalities
Abstract: FRIDA, a Framework and Robotics Initiative for Developing Arts, is a robot painting system designed to translate an artist's high-level intentions into real world paintings. FRIDA can paint from combinations of input images, text, style examples, sounds, and sketches. Planning is performed in a differentiable, simulated environment created using real data from the robot [...]
Perception for High-Speed Off-Road Driving
Abstract: On-road autonomous driving has seen rapid progress in recent years with driverless vehicles being tested in various cities worldwide. However, this progress is limited to cities with well-established infrastructure and has yet to transfer to off-road regimes with unstructured environments and few paved roads. Advances in high-speed and reliable autonomous off-road driving can unlock [...]
Continual Learning of Compositional Skills for Robust Robot Manipulation
Abstract: Real world robots need to continuously learn new manipulation tasks in a lifelong learning manner. These new tasks often share sub-structures (in the form of sub-tasks, controllers) with previously learned tasks. To utilize these shared sub-structures, we explore a compositional and object-centric approach to learn manipulation tasks. While compositionality in robot manipulation can manifest [...]
Carnegie Mellon University
MSR Thesis Talk: Samuel Ong
Title: Data-Driven Slip Model for Improved Localization and Path Following applied to Lunar Micro-Rovers Abstract: Micro-lunar rovers need to solve a slew of challenges on the Moon, with no human intervention. One such challenge is the need to know their location in order to navigate and build maps. However, localization is challenging on the Moon due [...]
Computational Interferometric Imaging
Abstract: Imaging systems typically accumulate photons that, as they travel from a light source to a camera, follow multiple different paths and interact with several scene objects. This multi-path accumulation process confounds the information that is available in captured images about the scene and makes using these images to infer properties of scene objects, such [...]
Robust and Context-Aware Real-Time Collaborative Robot Handling with Dynamic Gesture Commands
Abstract: Real-time collaborative robot (cobot) handling is a task where the cobot maneuvers an object under human dynamic gesture commands. Enabling dynamic gesture commands is useful when the human needs to avoid direct contact with the robot or the object handled by the robot. However, the key challenge lies in the heterogeneity in human behaviors [...]
Equivalent Policy Sets for Learning Aligned Models and Abstractions
Abstract: Recent successes in model-based reinforcement learning (MBRL) have demonstrated the enormous value that learned representations of environmental dynamics (i.e., models) can impart to autonomous decision making. While a learned model can never perfectly represent the dynamics of complex environments, models that are accurate in the "right" ways may still be highly useful for decision [...]
Dynamic Route Guidance in Vehicle Networks by Simulating Future Traffic Patterns
Abstract: Roadway congestion leads to wasted time and money and environmental damage. Since adding more roadway capacity is often not possible in urban environments, it is becoming more important to use existing road networks more efficiently. Toward this goal, recent research in real-time, schedule-driven intersection control has shown an ability to significantly reduce the delays [...]
Adaptive Robotic Assistance through Observations of Human Behavior
Abstract: Assistive robots should take actions that support people's goals. This is especially true as robots enter into environments where personal agency is paramount, such as a person's home. Home environments have a wide variety of "optimal" solutions that depend on personal preference, making it difficult for a robot to know the goal it should [...]
Beyond Pick-and-Place: Towards Dynamic and Contact-rich Motor Skills with Reinforcement Learning
Abstract: Interactions with the physical world are at the core of robotics. However, robotics research, especially in manipulation, has been mainly focused on tasks with limited interactions with the physical world such as pick-and-place or pushing objects on the table top. These interactions are often quasi-static, have a predefined or limited sequence of contact events and [...]
Adaptive-Anytime Planning and Mapping for Multi-Robot Exploration in Large Environments
Abstract: Robotic systems are being leveraged to explore environments too hazardous for humans to enter. Robot sensing, compute, and kinodynamic (SCK) capabilities are inextricably tied to the size, weight, and power (SWaP) constraints of the vehicle. When designing a robot team for exploration, the diversity and types of robots used must be carefully considered because [...]
Neural Radiance Fields with LiDAR Maps
Abstract: Maps, as our prior understanding of the environment, play an essential role for many modern robotic applications. The design of maps, in fact, is a non-trivial art of balance between storage and richness. In this thesis, we explored map compression for image-to-LiDAR registration, LiDAR-to-LiDAR map registration, and image-to-SfM map registration, and finally, inspired by [...]
Enabling Data-Efficient Real-World Model-Based Manipulation by Estimating Preconditions for Inaccurate Models
Abstract: This thesis explores estimating and reasoning about model deviation in robot learning for manipulation to improve data efficiency and reliability to enable real-robot manipulation in a world where models are inaccurate but still useful. Existing strategies are presented for improving planning robustness with low amounts of real-world data by an empirically estimated model precondition to guide [...]
Robust Adaptive Reinforcement Learning for Safety Critical Applications via Curricular Learning
Abstract: Reinforcement Learning (RL) holds great promise for autonomous agents. However, when using robots in a safety critical domain, a system has to be robust enough to be deployed in real life. For example, the robot should be able to perform across different scenarios it will encounter. The robot should avoid entering undesirable and irreversible [...]
MSR Thesis Talk: Yichen Li
Title: Simulation-guided Design for Vision-based Tactile Sensing on a Soft Robot Finger Abstract: Soft pneumatic robot manipulators have garnered widespread interest due to their compliance and flexibility, which enable soft, non-destructive grasping and strong adaptability to complex working environments. Tactile sensing is crucial for these manipulators to provide real-time contact information for control and manipulation. [...]
Controllable Visual-Tactile Synthesis
Abstract: Deep generative models have various content creation applications such as graphic design, e-commerce, and virtual try-on. However, current works mainly focus on synthesizing realistic visual outputs, often ignoring other sensory modalities, such as touch, which limits physical interaction with users. The main challenges for multi-modal synthesis lie in the significant scale discrepancy between vision [...]
Perceiving Particles Inside a Container using Dynamic Touch Sensing
Abstract: Dynamic touch sensing has shown potential for multiple tasks. In this talk, I will present how we utilize dynamic touch sensing to perceive particles inside a container with two tasks: classification of the particles inside a container and property estimation of the particles inside a container. First, we try to recognize what is inside [...]
Towards Photorealistic Dynamic Capture and Animation of Human Hair and Head
Abstract: Realistic human avatars play a key role in immersive virtual telepresence. To reach a high level of realism, a human avatar needs to faithfully reflect human appearance. A human avatar should also be drivable and express natural motions. Existing works have made significant progress on building drivable realistic face avatars, but they rarely include [...]
Carnegie Mellon University
System Identification and Control of Multiagent Systems Through Interactions
Abstract: This thesis investigates the problem of inferring the underlying dynamic model of individual agents of a multiagent system (MAS) and using these models to shape the MAS's behavior using robots extrinsic to the MAS. We investigate (a) how an observer can infer the latent task and inter-agent interaction constraints from the agents' motion and [...]
Examining the Role of Adaptation in Human-Robot Collaboration
Abstract: Human and AI partners increasingly need to work together to perform tasks as a team. In order to act effectively as teammates, collaborative AI should reason about how their behaviors interplay with the strategies and skills of human team members as they coordinate on achieving joint goals. This talk will discuss a formalism for [...]
A Multi-view Synthetic and Real-world Human Activity Recognition Dataset
Abstract: Advancements in Human Activity Recognition (HAR) partially rely on the creation of datasets that cover a broad range of activities under various conditions. Unfortunately, obtaining and labeling datasets containing human activity is complex, laborious, and costly. One way to mitigate these difficulties with sufficient generality to provide robust activity recognition on unseen data is [...]
Eye Gaze for Intelligent Driving
Abstract: Intelligent vehicles have been proposed as one path to increasing vehicular safety and reducing on-road crashes. Driving intelligence has taken many forms, ranging from simple blind spot occupancy or forward collision warnings to lane keeping and all the way to full driving autonomy in certain situations. Primarily, these methods are outward-facing and operate on [...]
Dense 3D Representation Learning for Geometric Reasoning in Manipulation Tasks
Abstract: When solving a manipulation task like "put away the groceries" in real environments, robots must understand what *can* happen in these environments, as well as what *should* happen in order to accomplish the task. This knowledge can enable downstream robot policies to directly reason about which actions they should execute, and rule out behaviors [...]