Computational Heat and Light Transport for Scene Understanding
Abstract: Thermal cameras don’t just capture heat maps—they see a mix of emitted and reflected infrared radiation. In this talk, I’ll show how we can computationally disentangle these signals to enable better interpretation of scenes from thermal data. I’ll begin with a dual-band imaging system that leverages differences in spectral emissivity to separate emitted radiation [...]
Unified Vision-Language Modeling
Abstract: Recent advances in large-scale language modeling have demonstrated significant success across various tasks, prompting efforts to extend these capabilities to other modalities, including 2D and 3D vision. However, this effort has been met with a variety of challenges due to fundamental differences in data representations, task-specific requirements, and the relative scarcity of large, high-quality [...]
SmokeSeer: 3D Gaussian Splatting for Smoke Removal and Scene Reconstruction
Abstract: In safety-critical environments such as firefighting, search and rescue, and industrial inspection, the presence of dense smoke severely hampers visual perception and degrades the performance of vision-based systems. Traditional dehazing and reconstruction methods are limited by their reliance on data-driven priors or assumptions of static, low-density smoke. We present SmokeSeer, a method that performs [...]
Advancing 3D Semantic and Geometric Reasoning
Abstract: Recent advances in foundation models have dramatically improved reasoning over language, vision, and decision-making for autonomous systems. However, extending this intelligence to embodied agents requires bridging the gap between abstract 2D understanding and grounded 3D interaction—a challenge driven by limited 3D data and the inherent complexity of spatial reasoning. This work addresses the problem [...]
Towards Scalable Layout Optimization for Large-Scale Multi-Robot Coordination Systems
Abstract: With the rapid progress in Multi-Agent Path Finding (MAPF), researchers have studied how MAPF algorithms can be deployed to coordinate hundreds of robots in large automated warehouses. While most works try to improve the throughput of such warehouses by developing better MAPF algorithms, we focus on improving the throughput by optimizing the warehouse layout. [...]
Learning Universal Humanoid Control
Abstract: Since infancy, humans acquire motor skills, behavioral priors, and objectives by learning from their caregivers. Similarly, as we create humanoids in our own image, we aspire for them to learn from us and develop universal physical and cognitive capabilities that are comparable to, or even surpass, our own. In this thesis, we explore how [...]
Enhancing the Physical Capabilities of Aerial Robots: From Inspection to Manipulation
Abstract: Uncrewed Aerial Vehicles (UAVs) are increasingly used for high-altitude tasks, many of which require not only perception but also active interaction with the environment. This has led to growing interest in aerial manipulation—combining aerial mobility with manipulation capabilities. In this talk, we explore how to move toward general aerial manipulation: enabling a single system [...]
Flexible Perception for High-Performance Robot Navigation
Abstract: Real-world autonomy requires perception systems that deliver rich, accurate information given the task and environment. However, as robots scale to diverse and rapidly evolving settings, maintaining this level of performance becomes increasingly brittle and labor-intensive, requiring significant human engineering and retraining for even small changes in environment and problem definition. To overcome this bottleneck, [...]
Generating a Physical World
Abstract: Generating an interactive, enlivened, and physical world enables a wide range of applications in entertainment, embodied AI, education, and creative design. Recent image and video models have shown promise in producing realistic visuals, yet they operate purely at the pixel level and lack underlying physical grounding, leading to failures in physical fidelity and user interactivity. In [...]
Learning Bayesian Experimental Design Policies Efficiently and Robustly
Abstract: Bayesian Experimental Design (BED) provides a principled framework for sequential data collection under uncertainty and is used in a wide range of domains, such as clinical trials, ecological monitoring, and hyperparameter optimization. Despite their wide applicability, BED methods remain challenging to deploy in practice due to their significant computational demands. This thesis addresses these computational [...]
Unlocking Robust Spatial Perception: Resilient State Estimation and Mapping for Long-term Autonomy
Abstract: How can we enable robots to perceive, adapt, and understand their surroundings like humans, in real time and under uncertainty? Just as humans rely on vision to navigate complex environments, robots need robust and intelligent perception systems: “eyes” that can endure sensor degradation, adapt to changing conditions, and recover from failure. However, today’s visual systems are fragile, easily [...]