Continual Learning of Compositional Skills for Robust Robot Manipulation
Abstract: Real world robots need to continuously learn new manipulation tasks in a lifelong learning manner. These new tasks often share many sub-structures e.g. sub-tasks, controllers, preconditions, with previously learned tasks. To utilize these shared sub-structures, we explore a compositional and object-centric approach to learn manipulation tasks. The first part of this thesis focuses on [...]
Towards Energy-Efficient Techniques and Applications for Universal AI Implementation
Abstract: The rapid advancement of large-scale language and vision models has significantly propelled the AI domain. We now see AI enriching everyday life in numerous ways – from community and shared virtual reality experiences to autonomous vehicles, healthcare innovations, and accessibility technologies, among others. Central to these developments is the real-time implementation of high-quality deep [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Watch, Practice, Improve: Towards In-the-wild Manipulation
Abstract: The longstanding dream of many roboticists is to see robots perform diverse tasks in diverse environments. To build such a robot that can operate anywhere, many methods train on robotic interaction data. While these approaches have led to significant advances, they rely on heavily engineered setups or high amounts of supervision, neither of which [...]
Structure-from-Motion Meets Self-supervised Learning
Abstract: How to teach machine to perceive 3D world from unlabeled videos? We will present new solution via incorporating Structure-from-Motion (SfM) into self-supervised model learning. Given RGB inputs, deep models learn to regress depth and correspondence. With the two inputs, we introduce a camera localization algorithm that searches for certified global optimal poses. However, the [...]
Combining Physics-Based Light Transport and Neural Fields for Robust Inverse Rendering
Abstract: Inverse rendering — the process of recovering shape, material, and/or lighting of an object or environment from a set of images — is essential for applications in robotics and elsewhere, from AR/VR to perception on self-driving vehicles. While it is possible to perform inverse rendering from color images alone, it is often far easier [...]
Improving the Transparency of Agent Decision Making to Humans Using Demonstrations
Abstract: For intelligent agents (e.g. robots) to be seamlessly integrated into human society, humans must be able to understand their decision making. For example, the decision making of autonomous cars must be clear to the engineers certifying their safety, passengers riding them, and nearby drivers negotiating the road simultaneously. As an agent's decision making depends [...]
Robotic Climbing for Extreme Terrain Exploration
Abstract: Climbing robots can operate in steep and unstructured environments that are inaccessible to other ground robots, with applications ranging from the inspection of artificial structures on Earth to the exploration of natural terrain features throughout the solar system. Climbing robots for planetary exploration face many challenges to deployment, including mass restrictions, irregular surface features, [...]
Layout Design for Large-Scale Multi-Robot Coordination
Abstract: Today, thousands of robots are navigating autonomously in warehouses, transporting goods from one location to another. While numerous planning algorithms are developed to coordinate robots more efficiently and robustly, warehouse layouts remain largely unchanged – they still adhere to the traditional pattern designed for human workers rather than robots. In this talk, I will [...]
Perception amidst interaction: spatial AI with vision and touch for robot manipulation
Abstract: Robots currently lack the cognition to replicate even a fraction of the tasks humans do, a trend summarized by Moravec's Paradox. Humans effortlessly combine their senses for everyday interactions—we can rummage through our pockets in search of our keys, and deftly insert them to unlock our front door. Before robots can demonstrate such dexterity, [...]
Toward Human-Centered XR: Bridging Cognition and Computation
Abstract: Virtual and Augmented Reality enables unprecedented possibilities for displaying virtual content, sensing physical surroundings, and tracking human behaviors with high fidelity. However, we still haven't created "superhumans" who can outperform what we are in physical reality, nor a "perfect" XR system that delivers infinite battery life or realistic sensation. In this talk, I will discuss some of our [...]
Carnegie Mellon Graphics Colloquium: C. Karen Liu : Building Large Models for Human Motion
Building Large Models for Human Motion Large generative models for human motion, analogous to ChatGPT for text, will enable human motion synthesis and prediction for a wide range of applications such as character animation, humanoid robots, AR/VR motion tracking, and healthcare. This model would generate diverse, realistic human motions and behaviors, including kinematics and dynamics, [...]
Teaching a Robot to Perform Surgery: From 3D Image Understanding to Deformable Manipulation
Abstract: Robot manipulation of rigid household objects and environments has made massive strides in the past few years due to the achievements in computer vision and reinforcement learning communities. One area that has taken off at a slower pace is in manipulating deformable objects. For example, surgical robotics are used today via teleoperation from a [...]
Zeros for Data Science
Abstract: The world around us is neither totally regular nor completely random. Our and robots’ reliance on spatiotemporal patterns in daily life cannot be over-stressed, given the fact that most of us can function (perceive, recognize, navigate) effectively in chaotic and previously unseen physical, social and digital worlds. Data science has been promoted and practiced [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Emotion perception: progress, challenges, and use cases
Abstract: One of the challenges Human-Centric AI systems face is understanding human behavior and emotions considering the context in which they take place. For example, current computer vision approaches for recognizing human emotions usually focus on facial movements and often ignore the context in which the facial movements take place. In this presentation, I will [...]
[MSR Thesis Talk] SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM
Abstract: Dense simultaneous localization and mapping (SLAM) is crucial for numerous robotic and augmented reality applications. However, current methods are often hampered by the non-volumetric or implicit way they represent a scene. This talk introduces SplaTAM, an approach that leverages explicit volumetric representations, i.e., 3D Gaussians, to enable high-fidelity reconstruction from a single unposed RGB-D [...]
Language: You’ve probably heard of it, read it, written it, gestured it, mimed it… Why can’t robots?
Abstract: Language is how meaning is conveyed between humans, and now the basis of foundation models. By implication, it's the most important modality for all of AGI and will replace the entire robotics control stack as the most important thing for all of us to work on.
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Foundation Models for Robotic Manipulation: Opportunities and Challenges
Abstract: Foundation models, such as GPT-4 Vision, have marked significant achievements in the fields of natural language and vision, demonstrating exceptional abilities to adapt to new tasks and scenarios. However, physical interaction—such as cooking, cleaning, or caregiving—remains a frontier where foundation models and robotic systems have yet to achieve the desired level of adaptability and [...]
Learning with Less
Abstract: The performance of an AI is nearly always associated with the amount of data you have at your disposal. Self-supervised machine learning can help – mitigating tedious human supervision – but the need for massive training datasets in modern AI seems unquenchable. Sometimes it is not the amount of data, but the mismatch of [...]
Human Perception of Robot Failure and Explanation During a Pick-and-Place Task
Abstract: In recent years, researchers have extensively used non-verbal gestures, such as head and arm movements, to express the robot's intentions and capabilities to humans. Inspired by past research, we investigated how different explanation modalities can aid human understanding and perception of how robots communicate failures and provide explanations during block pick-and-place tasks. Through an in-person [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Why We Should Build Robot Apprentices And Why We Shouldn’t Do It Alone
Abstract: For robots to be able to truly integrate human-populated, dynamic, and unpredictable environments, they will have to have strong adaptive capabilities. In this talk, I argue that these adaptive capabilities should leverage interaction with end users, who know how (they want) a robot to act in that environment. I will present an overview of [...]
Learning Distributional Models for Relative Placement
Abstract: Relative placement tasks are an important category of tasks in which one object needs to be placed in a desired pose relative to another object. Previous work has shown success in learning relative placement tasks from just a small number of demonstrations, when using relational reasoning networks with geometric inductive biases. However, such methods fail [...]
Robust Body Exposure (RoBE): A Graph-based Dynamics Modeling Approach to Manipulating Blankets over People
Abstract: Robotic caregivers could potentially improve the quality of life of many who require physical assistance. However, in order to assist individuals who are lying in bed, robots must be capable of dealing with a significant obstacle: the blanket or sheet that will almost always cover the person's body. We propose a method for targeted [...]
Exploration for Continually Improving Robots
Abstract: General purpose robots should be able to perform arbitrary manipulation tasks, and get better at performing new ones as they obtain more experience. The current paradigm in robot learning involves imitation or simulation. Scaling these approaches to learn from more data for various tasks is bottle-necked by human labor required either in collecting demonstrations [...]
Sparse-view 3D in the Wild
Abstract: Reconstructing 3D scenes and objects from images alone has been a long-standing goal in computer vision. We have seen tremendous progress in recent years, capable of producing near photo-realistic renderings from any viewpoint. However, existing approaches generally rely on a large number of input images (typically 50-100) to compute camera poses and ensure view [...]
Deep 3D Geometric Reasoning for Robot Manipulation
Abstract: To solve general manipulation tasks in real-world environments, robots must be able to perceive and condition their manipulation policies on the 3D world. These agents will need to understand various common-sense spatial/geometric concepts about manipulation tasks: that local geometry can suggest potential manipulation strategies, that policies should be invariant across choice of reference frame, [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Toward an ImageNet Moment for Synthetic Data
Abstract: Data, especially large-scale labeled data, has been a critical driver of progress in computer vision. However, many important tasks remain starved of high-quality data. Synthetic data from computer graphics is a promising solution to this challenge, but still remains in limited use. This talk will present our work on Infinigen, a procedural synthetic data [...]
Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World
Abstract: We show that imitating shortest-path planners in simulation produces Stretch RE-1 robotic agents that, given language instructions, can proficiently navigate, explore, and manipulate objects in both simulation and in the real world using only RGB sensors (no depth maps or GPS coordinates). This surprising result is enabled by our end-to-end, transformer-based, SPOC architecture, powerful [...]
Probabilistic 3D Multi-Object Cooperative Tracking for Autonomous Driving via Differentiable Multi-Sensor Kalman Filter
This talk has been postponed […]
Towards diverse zero-shot manipulation via actualizing visual plans
Abstract: In this thesis, we seek to learn a generalizable goal-conditioned policy that enables zero-shot robot manipulation — interacting with unseen objects in novel scenes without test-time adaptation. Robots that can be reliably deployed out-of-the-box in new scenarios have the potential for helping humans in everyday tasks. Not requiring any test-time training through demonstrations or [...]
Deep Learning for Sensors: Development to Deployment
Abstract: Robots rely heavily on sensing to reason about physical interactions, and recent advancements in rapid prototyping, MEMS sensing, and machine learning have led to a plethora of sensing alternatives. However, few of these sensors have gained widespread use among roboticists. This thesis proposes a framework for incorporating sensors into a robot learning paradigm, from [...]
Offline Learning for Stochastic Multi-Agent Planning in Autonomous Driving
Abstract: Fully autonomous vehicles have the potential to greatly reduce vehicular accidents and revolutionize how people travel and how we transport goods. Many of the major challenges for autonomous driving systems emerge from the numerous traffic situations that require complex interactions with other agents. For the foreseeable future, autonomous vehicles will have to share the [...]
Teruko Yata Memorial Lecture
Human-Centric Robots and How Learning Enables Generality Abstract: Humans have dreamt of robot helpers forever. What's new is that this dream is becoming real. New developments in AI, building on foundations of hardware and passive dynamics, enable vastly improved generality. Robots can step out of highly structured environments and become more human-centric: operating in human [...]
2024 Robotics Institute National Robotics Week Celebration Tours and Demos
April 12 1:00 - 4:00 pm: PUBLIC SPACE ROBOTS Open to the public TANK the roboceptionist Newell-Simon Hall 3rd floor entry area Meet Marion (Tank) LeFleur, Newell-Simon’s Roboceptionist. He’ll be glad to see you! The goal of the project is to produce a robot helpmate that is useful, exhibits social competence, and remains compelling to [...]
Creating robust deep learning models involves effectively managing nuisance variables
Abstract: Over the past decade, we have witnessed significant advances in capabilities of deep neural network models in vision and machine learning. However, issues related to bias, discrimination, and fairness in general, have received a great deal of negative attention (e.g., mistakes in surveillance and animal-human confusion of vision models). But bias in AI models [...]
Transfer Learning via Temporal Contrastive Learning Inbox
Abstract: This thesis introduces a novel transfer learning framework for deep reinforcement learning. The approach automatically combines goal-conditioned policies with temporal contrastive learning to discover meaningful sub-goals. The approach involves pre-training a goal-conditioned agent, finetuning it on the target domain, and using contrastive learning to construct a planning graph that guides the agent via sub-goals. Experiments [...]
Towards Influence-Aware Safe Human-Robot Interaction
Abstract: In recent years, we have seen through recommender systems on social media how influential (and potentially harmful) algorithms can be in our lives, sometimes creating polarization and conspiracies that lead to unsafe behavior. Now that robots are also growing more common in the real world, we must be very careful to ensure that they [...]
Learning to Manipulate beyond Imitation
Abstract: Imitation learning has been a prevalent approach for teaching robots manipulation skills but still suffers from scalability and generalizability. In this talk, I'll argue for going beyond elementary behavioral imitation from human demonstrations. Instead, I'll present two key directions: 1) Creating Manipulation Controllers from Pre-Trained Representations, and 2) Representing Video Demonstrations with Parameterized Symbolic [...]
Advanced robotics for manufacturing: challenges and opportunities
Abstract: Presenting projects with ARM Institute (including robot grinding, human-robot collaboration, and modularized manufacturing) and discussing some new opportunities in applying AI and robotics in manufacturing domain.
Improving Robot Capabilities Through Reconfigurability
Abstract: Advancements in robot capabilities are often achieved through integrating more hardware components. These hardware additions often lead to systems with high power consumption, fragility, and difficulties in control and maintenance. However, is this approach the only path to enhancing robot functionality? In this talk, I introduce the PuzzleBots, a modular multi-robot system with passive [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Reduced-Gravity Flights and Field Testing for Lunar and Planetary Rovers
Abstract: As humanity returns to the Moon and is developing outposts and related infrastructure, we need to understand how robots and work machines will behave in this harsh environment. It is challenging to find representative testing environments on Earth for Lunar and planetary rovers. To investigate the effects of reduced-gravity on interactions with granular terrains, [...]
Design Principles for Robotics Systems that Support Human-Human Collaborative Learning
Abstract: Robots possess unique affordances granted by combining software and hardware. Most existing research focuses on the impact of these affordances on human-robot collaboration, but the theory of how robots can facilitate human-human collaboration is underdeveloped. Such theory would be beneficial in education. An educational device can afford collaboration in both assembly and use. This [...]
Leveraging Parallelism to Accelerate Quadratic Program Solvers for MPC
Abstract: Many problems in robotics can be formulated as quadratic programs (QPs). In particular, model-predictive control problems often involve repeatedly solving QPs at very high rates (up to kilohertz). However, while other areas of robotics like machine learning have achieved high performance by taking advantage of parallelism on modern computing hardware, state-of-the-art algorithms for solving [...]
Shedding Light on 3D Cameras
Abstract: The advent (and commoditization) of low-cost 3D cameras is revolutionizing many application domains, including robotics, autonomous navigation, human computer interfaces, and recently even consumer devices such as cell-phones. Most modern 3D cameras (e.g., LiDAR) are active; they consist of a light source that emits coded light into the scene, i.e., its intensity is modulated over [...]
Robust Incremental Distributed Collaborative Simultaneous Localization and Mapping
Abstract: Multi-robot teams show exceptional promise across applications like Search-and-Rescue, disaster-response, agriculture, forestry, and scientific exploration due to their ability to go where humans cannot, parallelize activity, operate robustly to failures, and expand capabilities beyond that of an individual robot. Collaborative Simultaneous Localization and Mapping (C-SLAM) is a fundamental capability for these multi-robot teams as [...]
Towards Equitable Representation in Text-to-Image Generation
Abstract: Accurate representation in media is known to improve the well-being of the people who consume it. There is a growing concern about the increasing use of generative AI in media as the generative image models trained on large web-crawled datasets such as LAION are known to produce images with harmful stereotypes and misrepresentations of various groups, [...]
3D Inference from Unposed Sparse View Images
Abstract: We propose UpFusion, a system that can perform novel view synthesis and infer 3D representations for generic objects given a sparse set of reference images without corresponding pose information. Current sparse-view 3D inference methods typically rely on camera poses to geometrically aggregate information from input views, but are not robust in-the-wild when such information [...]
Tightly Coupled LIDAR-Inertial Odometry
Abstract: In the age of self-driving, LIDAR and IMU represent two of the most ubiqui- tous sensors in use. Kalman Filtering and loosely coupled approaches dominate industry techniques, while current research trends towards a more tightly coupled formulation involving a joint optimization of IMU and LIDAR measurements. After two years of experience working with and [...]
A Unified Control Framework for Robust Aerial Manipulation
Abstract: Aerial robots are now widely employed in diverse applications, such as delivery, environmental monitoring, and especially aerial manipulation—the focus of this thesis. Aerial manipulation involves integrating robotic arms with drones to perform physical tasks remotely. This capability is particularly crucial for operations that are either too dangerous or inaccessible for humans, such as high-altitude [...]
In Pursuit of Open-World Mobile Manipulation
Abstract: Deploying robots in open-ended unstructured environments such as homes has been a long-standing research problem. However, robots are often studied only in closed-off lab settings, and prior mobile manipulation work is restricted to pick-move-place, which is arguably just the tip of the iceberg in this area. In this thesis, we introduce the Open-World Mobile [...]
Carnegie Mellon University
Geometric Heuristics Enhance POCUS AI for Pneumothorax
Abstract: The interpretation of Point-of-care ultrasound (POCUS) images poses a challenge due to the scarcity of high-quality labelled data for training AI models in the medical domain. To address this limitation, novel methodologies were developed to train POCUS AI models using limited data, integrating geometric heuristics derived from expert clinicians. Focused on diagnosing pneumothorax, heuristics [...]
Optimal Control and Robot Learning on Agile Safety-Critical Systems
Abstract: We present a pipeline of optimal control methods for learning an optimal control policy and locally accurate dynamics models for agile and safety-critical robots using autonomous racing as an application example. We introduce Spline-Opt, a fast offline/online optimization and planning method that can produce a reasonably good initial optimal trajectory given very little dynamics [...]
Vision Model Diagnosis and Improvement Via Large Pretrained Models
Abstract: As AI becomes increasingly pervasive in real-world applications, the deployment of machine learning models in real-world applications has underscored critical challenges in model robustness, fairness and performance. Despite significant advances, existing models often exhibit biases, fail to generalize across diverse data distributions, and struggle with unexpected input variations, leading to suboptimal or even discrimina- [...]
Beyond Robot Safety: Adaptability and Interactivity
Abstract: The deployment of autonomous robots in various areas, including transportation and human-robot collaboration, requires strong safety measures for effective interaction with the physical world. Traditional safe control algorithms work well in controlled settings but struggle to adapt to more interactive and unpredictable real-world scenarios. This thesis emphasizes the need to explore beyond traditional robot [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Discussions include various department topics, policies, and procedures. Generally meets weekly.
Indoor Localization and Mapping with 4D mmWave Imaging Radar
Abstract: State estimation is a crucial component for the successful implementation of robotic systems, relying on sensors such as cameras, LiDAR, and IMUs. However, in real-world scenarios, the performance of these sensors is degraded by challenging environments, e.g. adverse weather conditions and low-light scenarios. The emerging 4D imaging radar technology is capable of providing robust perception in adverse conditions. [...]
PIE-FRIDA: Personalized Interactive Emotion-Guided Collaborative Human-Robot Art Creation
Abstract: The introduction of generative AI has brought about many improvements in the artistic world. It allows many individuals to create artwork via simple descriptive text prompts. This has, in particular, created an avenue for non-artistic individuals to express their thoughts through generated art. Our work focuses on how emotion can be added as an [...]
Where’s RobotGPT?
Abstract: The last years have seen astonishing progress in the capabilities of generative AI techniques, particularly in the areas of language and visual understanding and generation. Key to the success of these models are the use of image and text data sets of unprecedented scale along with models that are able to digest such large [...]
Carnegie Mellon University
Spectral Mapping using Simple Sensors
Abstract: Spectral mapping holds significant importance in many exploration endeavors as it facilitates a deeper comprehension of material composition within a surveyed area. While imaging spectrometers excel in recording reflectance spectra into spectral maps, their large physical footprint, substantial power requirements, and operational intricacies render them unsuitable for integration into small rovers or resource-constrained missions. [...]
Neural Field Representations of Mobile Computational Photography
Abstract: Burst imaging pipelines allow cellphones to compensate for less-than-ideal optical and sensor hardware by computationally merging multiple lower-quality images into a single high-quality output. The main challenge for these pipelines is compensating for pixel motion, estimating how to align and merge measurements across time while the user's natural hand tremor involuntarily shakes the camera. In [...]