Universal Semantic-Geometric Priors for Zero-Shot Robotic Manipulation
Abstract: Visual imitation learning has shown promising results in robotic manipulation in recent years. However, its generalization to unseen objects is often limited by the size and diversity of training data. Although more large-scale robotic datasets are available, they remain significantly smaller than image and text datasets. Additionally, scaling these datasets is time-consuming and labor-intensive, [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Agenda was sent via a calendar invite.
Personalized Context-aware Multimodal Robot Feedback
Abstract: In the field of human-robot interaction (HRI), integration of robots into social settings, such as healthcare and education, is gaining traction. Robots that provide individualized support to improve human performance and subjective experience will generally be more successful in these domains. Robots should personalize their interactions, be aware of the contextual nuances surrounding their [...]
Sensorized Soft Materials Systems with Integrated Electronics and Computing
Abstract: The integration of soft and multifunctional materials in emerging technologies is becoming more widespread due to their ability to enhance or improve functionality in ways not possible using typical rigid alternatives. This trend is evident in various fields. For example, wearable technologies are increasingly designed using soft materials to improve modulus compatibility with biological [...]
Enabling Reliable Model-Based Planning with Inaccurate Models
Abstract: This thesis aims to provide a framework for combining complementary tools that enable robots to manipulate objects in the world using diverse forms of knowledge. We consider heterogeneous types of knowledge, such as physics-based models, learned dynamics models, and model-free skills learned from human demonstrations. Each form of knowledge comes with its own assumptions [...]
Unlocking Generalization for Robotics via Scale and Modularity
Abstract: How can we build generalist robot systems? Looking at fields such as vision and language, the common theme has been large scale end-to-end learning with massive, curated datasets. In robotics, on the other hand, scale alone may not be enough due to the significant multimodality of robotics tasks, lack of easily accessible data and [...]
Uncertainty and Contact with the World
Abstract: As robots move out of the lab and factory and into more challenging environments, uncertainty in the robot's state, dynamics, and contact conditions becomes a fact of life. We will never be able to perfectly predict the forces on the robot's feet as it walks through unknown mud or control the deflections of a [...]
Advancing Multimodal Sensing and Robotic Interfaces for Chronic Care
Abstract: The healthcare system prioritizes reactive care for acute illnesses, often overlooking the ongoing needs of individuals with chronic conditions that require long-term management and personalized care. Addressing this gap through technology can empower patients to better manage their conditions, enhancing independence and quality of life. Multimodal sensing, incorporating inertial, acoustic, and vision-based sensors, within [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Agenda was sent via a calendar invite.
Towards Open World Robot Safety
Abstract: Robot safety is a nuanced concept. We commonly equate safety with collision-avoidance, but in complex, real-world environments (i.e., the “open world”) it can be much more: for example, a mobile manipulator should understand when it is not confident about a requested task, that areas roped off by caution tape should never be breached, and [...]
Controllable Visual Imagination
Abstract: Generative models have empowered human creators to visualize their imaginations without artistic skills and labor. A prominent example is large-scale text-to-image generation models. However, these models often are difficult to control and do not respect 3D perspective geometry and temporal consistency of videos. In this talk, I will showcase several of our recent efforts to [...]
Low-Cost Multimodal Sensing and Dexterity for Deformable Object Manipulation
Abstract: To integrate robots seamlessly into daily life, they must be able to handle a variety of tasks in diverse environments, like cooking in restaurants or tidying up around the house. Many of the items in these environments are deformable, such as fruits or bed sheets, and a certain level of dexterity is necessary to [...]
Towards Spatial Intelligence for Behaviors and Environments
Abstract: We are in an era of foundation models and spatial intelligence (AR/VR). Despite significant advancements in natural language processing for reasoning, other modalities like vision lag behind, offering limited contributions: current video-language models (VLMs) struggle even with basic spatial reasoning tasks. The challenge lies in the disparate training needs of different modalities. To enhance [...]
Developing Physically Capable and Intelligent Robots
Abstract: Dr. Rizzi will provide an overview of the work at the Robotics and AI Institute (RAI Institute) and its ongoing research efforts focused on the design and control of the next generation of intelligent and capable robotics systems. The focus is on the development of systems capable of performing complex dynamic tasks at [...]
Discovering and Erasing Undesired Concepts
Abstract: The rapid growth of generative models allows an ever-increasing variety of capabilities. Yet, these models may also produce undesired content such as unsafe or misleading images, private information, or copyrighted material. In this talk, I will discuss practical methods to prevent undesired generations. First, I will show how the challenge of avoiding undesired generations [...]
Mass-Constrained Robotic Climbing on Irregular Terrain
Abstract: Climbing robots can operate in steep and unstructured environments that are inaccessible to other ground robots, with applications ranging from the inspection of artificial structures on Earth to the exploration of natural terrain features throughout the solar system. Climbing robots for planetary exploration face many challenges to deployment, including mass restrictions, irregular surface features, [...]
Towards Annotation-Free Visual-Geometric Representations and Learning for Navigation in Unstructured Environments
Abstract: Navigation in unstructured environments is a capability critical to many robotics applications such as forestry, construction, disaster response and defense. In these domains, robots have the potential to eliminate much of the dull, dirty and/or dangerous work that is currently performed by humans. Unfortunately, these environments pose a unique set of challenges for navigation [...]
RI Faculty Business Meeting
Meeting for RI Faculty. Agenda was sent via a calendar invite.
Creative Tools: In Press, In Submission, and In Progress
Abstract: It's been a while since I've had a chance to show the rest of the RI what I and my various collaborators have been working on. So this talk will be an informal and rapid-fire tour through some of the freshest results from my lab, including work that is in press, in submission, and in [...]
Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation
Abstract: Recent advances in GPU-based parallel simulation have enabled practitioners to collect large amounts of data and train complex control policies using deep reinforcement learning (RL), on commodity GPUs. However, such successes for RL in robotics have been limited to tasks sufficiently simulated by fast rigid-body dynamics. Simulation techniques for soft bodies are comparatively several [...]
Is Data All You Need?: Large Robot Action Models and Good Old Fashioned Engineering
Abstract: Enthusiasm has been skyrocketing for humanoids based on recent advances in "end-to-end" large robot action models. Initial results are promising, and several collaborative efforts are underway to collect the needed demonstration data. But is data really all you need? Although end-to-end Large Vision, Language, Action (VLA) Models have potential to generalize and reliably solve [...]
Informative Path Planning Toward Autonomous Real-World Applications
Abstract: Gathering information from the physical world is critical for applications such as scientific research, environmental monitoring, search and rescue, defense, and disaster response. Autonomous robots provide significant advantages for information gathering, particularly in situations where human access is constrained, hazardous, or impractical. By leveraging intelligent algorithms, these robots can efficiently collect data, enhancing decision-making [...]
The New Era of Video Generation
Abstract: Traditional video production is slow, expensive, and requires specialized skills. Founded by CMU alumni, HeyGen is an AI-native video platform designed to revolutionize the video creation process by making visual storytelling accessible to all. We've successfully grown to more than 20M users, and tens of millions in revenue in less than one year, with recognition [...]
Robot Safety Beyond Collision-Avoidance
Abstract: It is common to equate robot safety with “collision avoidance”, but in unstructured open-world environments, a robot’s representation of safety should be much more nuanced. For example, the household manipulator should understand that pouring coffee too fast will cause the liquid to overflow or pulling a mug too quickly from a cupboard will cause [...]
Sensing the Unseen: Dexterous Tool Manipulation Through Touch and Vision
Abstract: Dexterous tool manipulation is a dance between tool motion, deformation, and force transmission choreographed by the robot's end-effector. Take for example the use of a spatula. How should the robot reason jointly over the tool’s geometry and forces imparted to the environment through vision and touch? In this talk, I will present our recent [...]
Autoregressive Models: Foundations and Open Questions
Abstract: The success of Autoregressive (AR) models in language today is so tremendous that their scope has, in turn, been largely narrowed to specific instantiations. In this talk, we will revisit the foundations of classical AR models, discussing essential concepts that may have been overlooked in modern practice. We will then introduce our recent research [...]
Carnegie Mellon University
Enabling Collaboration between Creators and Generative Models
Abstract: Generative models have made visual content creation as little effort as writing a short text description. Meanwhile, these models also spark concerns among artists, designers, and photographers about job security and data ownership. This leads to many questions: Will generative models make creators’ jobs obsolete? Should creators stop sharing their work publicly? How can creators [...]
Learning Environment Models for Mobile Robot Autonomy
Abstract: Robots are expected to execute increasingly complex tasks in increasingly complex and a priori unknown environments. A key prerequisite is the ability to understand the geometry and semantics of the environment in real time from sensor observations. This talk will present techniques for learning metric-semantic environment models from RGB and depth observations. Specific examples include [...]
Teruko Yata Memorial Lecture in Robotics
Title: Learning World Simulators from Data
Abstract: Modern foundational models have achieved superhuman performance in many logic and mathematical reasoning tasks by learning to think step by step. However, their ability to understand videos, and, consequently, control embodied agents, lags behind. They often make mistakes in recognizing simple activities, and often hallucinate when generating videos. This [...]
Investigating Compositional Reasoning in Time Series Foundation Models
Abstract: Large pre-trained time series foundation models (TSFMs) have demonstrated promising zero-shot performance across a wide range of domains. However, a question remains: Do TSFMs succeed solely by memorizing training patterns, or do they possess the ability to reason? While reasoning is a topic of great interest in the study of Large Language Models (LLMs), [...]
Learning from Animal and Human Videos
Abstract: Animals and humans can learn from the billions of years of life on Earth and the evolution that has shaped it. If robots can borrow from that wealth of experience, they too could be enabled to learn from the experience, instead of learning through brute force trial-and-error. Learning from internet-scale videos, such as the [...]
Learning Efficient 3D Generation
Abstract: Recent advances in 3D generation have enabled the synthesis of multi-view images using large-scale pre-trained 2D diffusion models. However, these methods typically require dozens of forward passes, resulting in significant computational overhead. In this talk, we introduce Turbo3D, an ultra-fast text-to-3D system that generates high-quality Gaussian Splatting assets in under one second. Turbo3D features a [...]
Reconstructing Tree Skeletons in Agricultural Robotics: A Comparative Study of Single-View and Volumetric Methods
Abstract: This thesis investigates the problem of reconstructing tree skeletons for agricultural robotics, comparing single-view image-based (Image to 3D) and volumetric (3D to 3D) methods. Accurate 3D modeling is essential for robotic tasks like pruning and harvesting, where understanding the underlying branch structure is critical. Using a custom-generated dataset of synthetic trees, we train encoder-decoder [...]
Acoustic Neural 3D Reconstruction Under Pose Drift
Abstract: We consider the problem of optimizing neural implicit surfaces for 3D reconstruction using acoustic images collected with drifting sensor poses. The accuracy of current state-of-the-art 3D acoustic modeling algorithms is highly dependent on accurate pose estimation; small errors in sensor pose can lead to severe reconstruction artifacts. In this paper, we propose an algorithm [...]
Open-World Policy Steering for Robot Manipulation
Abstract: Generative robot policies have shown remarkable potential in learning complex, multimodal behaviors from demonstrations. However, at runtime, they still exhibit diverse failures ranging from task incompletion (e.g., toppling or dropping objects) to misaligned behaviors (e.g., placing the gripper inside of a cup of water). Instead of constantly re-training the policies with new data, we [...]
Faculty Candidate Talk: Karl Pertsch
Talk Title: Unlocking Scalable Robot Learning in the Real World
Abstract: Many domains of machine learning, from language modeling to computer vision, have recently undergone a shift towards generalist models, whose broad generalization abilities are fueled by large and diverse real-world training datasets and high-capacity model architectures. In robotics, however, it has been challenging to [...]
Deep 3D Geometric Reasoning for Robot Manipulation
Abstract: To solve general manipulation tasks in real-world environments, robots must be able to perceive and condition their manipulation policies on the 3D world. These agents will need to understand various common-sense spatial/geometric concepts about manipulation tasks: that local geometry can suggest potential manipulation strategies; that changes in observation viewpoint shouldn't affect the interpretation of [...]
Faculty Candidate Talk: Aja Carter
Title: Paleorobotics: Design Principles 540 million years in the making
Abstract: Bioinspiration has provided key design insights in many fields, particularly in robotics, where there has been an explosion of interest in quadrupedal robot “dogs” and bipedal humanoid robots. However, the designs prescribed by only considering living animals are a small subset of available designs; [...]
Deformation-Aware Manipulation: Compliant and Geometric Approaches for Non-Anthropomorphic Hands
Abstract: Soft robot hands offer compelling advantages for manipulation tasks, including inherent safety through material compliance, robust adaptation to uncertain object geometries, and the ability to conform to complex shapes passively. However, these same properties create significant challenges for conventional sensing and control approaches. This talk presents approaches to bridging advances in geometric learning and [...]
Faculty Candidate Talk: Carlo Sferrazza
Title: The Path to Humanoid Intelligence
Abstract: Humanoid robots represent the ideal physical embodiment to assist us in the diversity of our daily tasks and human-centric environments. Driven by substantial hardware advancements, progress in artificial intelligence (AI), and a growing demand for adaptable automation, this vision appears increasingly feasible. Yet, to date, humanoid intelligence remains [...]
Integrating Safety Across the Learning-Based Perception Pipeline: From Training to Deployment
Abstract: Robots operating in safety-critical environments must reason under uncertainty and adapt to novel situations. However, recent advances in data-driven perception have made it increasingly difficult to provide formal safety guarantees, particularly when systems encounter out-of-distribution or previously unseen inputs. For such systems to be safely deployed in the real world, we need data augmentation [...]
Physical Intelligence and Cognitive Biases Toward AI
Abstract: When will robots be able to clean my house, do my dishes, and take care of my laundry? While we source labor primarily from automated machines in factories, the penetration of physical robots in our daily lives has been slow. What are the challenges in realizing these intelligent machines capable of human level skill? Isn’t AI advanced [...]
Robotics Institute Semi-formal
Hello all Robotics Institute faculty, students, visitors and staff,
You and a guest are cordially invited to attend The Robotics Institute Semi-formal.
Faculty Candidate Talk: Jason Ma
Title: Internet Supervision for Robot Learning
Abstract: The availability of internet-scale data has led to impressive large-scale AI models in various domains, such as vision and language. For learning robot skills, despite recent efforts in crowd-sourcing robot data, robot-specific datasets remain orders of magnitude smaller. Rather than focusing on scaling robot data, my research takes the alternative path of directly [...]
RI Seminar with Charlie Kemp
Robotics Institute Picnic
Please mark your calendars and plan to join us for the 2025 Robotics Institute Picnic! More information and RSVP e-vite to follow as we get closer to the event.