PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Computational Heat and Light Transport for Scene Understanding

GHC 4101

Abstract: Thermal cameras don’t just capture heat maps—they see a mix of emitted and reflected infrared radiation. In this talk, I’ll show how we can computationally disentangle these signals to enable better interpretation of scenes from thermal data. I’ll begin with a dual-band imaging system that leverages differences in spectral emissivity to separate emitted radiation [...]

MSR Thesis Defense
MSR Students
Robotics Institute,
Carnegie Mellon University

Unified Vision-Language Modeling

GHC 4405

Abstract: Recent advances in large-scale language modeling have demonstrated significant success across various tasks, prompting efforts to extend these capabilities to other modalities, including 2D and 3D vision. However, this effort has been met with a variety of challenges due to fundamental differences in data representations, task-specific requirements, and the relative scarcity of large, high-quality [...]

MSR Thesis Defense
MSR Student
Robotics Institute,
Carnegie Mellon University

SmokeSeer: 3D Gaussian Splatting for Smoke Removal and Scene Reconstruction

GHC 8102

Abstract: In safety-critical environments such as firefighting, search and rescue, and industrial inspection, the presence of dense smoke severely hampers visual perception and degrades the performance of vision-based systems. Traditional dehazing and reconstruction methods are limited by their reliance on data-driven priors or assumptions of static, low-density smoke. We present SmokeSeer, a method that performs [...]

MSR Thesis Defense
MSR Student
Robotics Institute,
Carnegie Mellon University

Advancing 3D Semantic and Geometric Reasoning

GHC 6115

Abstract: Recent advances in foundation models have dramatically improved reasoning over language, vision, and decision-making for autonomous systems. However, extending this intelligence to embodied agents requires bridging the gap between abstract 2D understanding and grounded 3D interaction—a challenge driven by limited 3D data and the inherent complexity of spatial reasoning. This work addresses the problem [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Towards Scalable Layout Optimization for Large-Scale Multi-Robot Coordination Systems

GHC 6501

Abstract: With the rapid progress in Multi-Agent Path Finding (MAPF), researchers have studied how MAPF algorithms can be deployed to coordinate hundreds of robots in large automated warehouses. While most works try to improve the throughput of such warehouses by developing better MAPF algorithms, we focus on improving the throughput by optimizing the warehouse layout. [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Learning Universal Humanoid Control

WEH 5320

Abstract: Since infancy, humans acquire motor skills, behavioral priors, and objectives by learning from their caregivers. Similarly, as we create humanoids in our own image, we aspire for them to learn from us and develop universal physical and cognitive capabilities that are comparable to, or even surpass, our own. In this thesis, we explore how [...]

MSR Thesis Defense
Teaching Assistant / MSR Student
Robotics Institute,
Carnegie Mellon University

Enhancing the Physical Capabilities of Aerial Robots: From Inspection to Manipulation

GHC 6501

Abstract: Uncrewed Aerial Vehicles (UAVs) are increasingly used for high-altitude tasks, many of which require not only perception but also active interaction with the environment. This has led to growing interest in aerial manipulation—combining aerial mobility with manipulation capabilities. In this talk, we explore how to move toward general aerial manipulation: enabling a single system [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Flexible Perception for High-Performance Robot Navigation

NSH 3305

Abstract: Real-world autonomy requires perception systems that deliver rich, accurate information given the task and environment. However, as robots scale to diverse and rapidly evolving settings, maintaining this level of performance becomes increasingly brittle and labor-intensive, requiring significant human engineering and retraining for even small changes in environment and problem definition. To overcome this bottleneck, [...]

VASC Seminar
Hong-Xing “Koven” Yu
PhD candidate
Computer Science Department, Stanford University

Generating a Physical World

3305 Newell-Simon Hall

Abstract: Generating an interactive, enlivened, and physical world enables a wide range of applications in entertainment, embodied AI, education, and creative designs. Recent image/video models have shown promise in producing realistic visuals, yet they operate purely at the pixel level and lack underlying physical grounding, leading to failures in physical fidelity and user interactivity. In [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Learning Bayesian Experimental Design Policies Efficiently and Robustly

NSH 3305

Abstract: Bayesian Experimental Design (BED) provides a principled framework for sequential data collection under uncertainty and is used in a wide range of domains, such as clinical trials, ecological monitoring, and hyperparameter optimization. Despite their wide applicability, BED methods remain challenging to deploy in practice due to their significant computational demands. This thesis addresses these computational [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Unlocking Robust Spatial Perception: Resilient State Estimation and Mapping for Long-term Autonomy

NSH 3305

Abstract: How can we enable robots to perceive, adapt, and understand their surroundings like humans—in real-time and under uncertainty? Just as humans rely on vision to navigate complex environments, robots need robust and intelligent perception systems—“eyes” that can endure sensor degradation, adapt to changing conditions, and recover from failure. However, today’s visual systems are fragile—easily [...]

VASC Seminar
David Chu
VP of Spatial Computing and XR
NVIDIA

When Spatial Computing meets Accelerated Computing

3305 Newell-Simon Hall

Abstract: NVIDIA has been pioneering Accelerated Computing for the past three decades, driving innovations that have transformed society. Among all personal computing mediums, Spatial Computing and Extended Reality (XR) stand out as some of the most promising beneficiaries of accelerated computing. In this talk, we will explore the latest developments and trends in the XR ecosystem, [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

From Pixels to Physical Intelligence: Semantic 3D Data Generation at Internet Scale

GHC 4405

Abstract: Modern AI won’t achieve physical intelligence until it can extract rich, semantic spatial knowledge from the wild ocean of internet video—not just curated motion-capture datasets or expensive 3D scans. This thesis proposes a self-bootstrapping pipeline for converting raw pixels into large-scale 3D and 4D spatial understanding. It begins with multi-view bootstrapping: using just two [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Self-Supervised Perception for Tactile Dexterity

GHC 4405

Abstract: Humans are incredibly dexterous. We interact with and manipulate tools effortlessly, leveraging touch without giving it a second thought. Yet replicating this level of dexterity in robots is a major challenge. While the robotics community, recognizing the importance of touch in fine manipulation, has developed a wide variety of tactile sensors, how best to [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Differentiable Probabilistic Inference and Rendering for Multimodal Robotic Perception

NSH 4305

Abstract: Robots are increasingly deployed to automate tasks that are dangerous or mundane for humans, such as search and rescue, mapping, and inspection in difficult environments. They rely on their perception stack, typically composed of complementary sensing modalities, to estimate their own state and the state of the environment to enable informed decision-making. This thesis [...]

VASC Seminar
Mike Shou
Assistant Professor
National University of Singapore

Video intelligence in the era of multimodal

3305 Newell-Simon Hall

Abstract: The past few years have witnessed great success in video intelligence, supercharged by multimodal models. In this talk, I will start by briefly sharing our efforts in building video-language models for understanding and diffusion models for video generation. Yet, video understanding and generation have always been two separate research pillars, despite [...]

RI Event

Robotics Institute Picnic

Please mark your calendars and plan to join us for the 2025 Robotics Institute Picnic! More information and RSVP e-vite to follow as we get closer to the event.