PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Generative Robotics: Self-Supervised Learning for Human-Robot Collaborative Creation

NSH 4305

Abstract: While generative AI has achieved breakthroughs in recent years in generating new digital content, such as images or 3D models, from high-level goal inputs like text, robotics technologies have not, instead focusing on low-level goal inputs. We propose Generative Robotics as a new field of robotics that combines the high-level goal input abilities of [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

3D Video Models through Point Tracking, Reconstructing and Forecasting

NSH 3305

Abstract: 3D scene understanding from 2D video is essential for enabling advanced applications such as autonomous driving, robotics, virtual reality, and augmented reality. These fields rely on accurate 3D spatial awareness and dynamic interaction modeling to navigate complex environments, manipulate objects, and provide immersive experiences. Unlike 2D, 3D training data is much less abundant, which [...]

RI Seminar
Nikolai Matni
Assistant Professor
Department of Electrical and Systems Engineering, University of Pennsylvania

What Makes Learning to Control Easy or Hard?

1403 Tepper School Building

Abstract: Designing autonomous systems that are simultaneously high-performing, adaptive, and provably safe remains an open problem. In this talk, we will argue that in order to meet this goal, new theoretical and algorithmic tools are needed that blend the stability, robustness, and safety guarantees of robust control with the flexibility, adaptability, and performance of machine [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Towards a Robot Generalist through In-Context Learning and Abstractions

NSH 1305

Abstract: The goal of this thesis is to discover AI processes that enhance cross-domain and cross-task generalization in intelligent robot agents. Unlike the dominant approach in contemporary robot learning, which pursues generalization primarily through scaling laws (increasing data and model size), we focus on identifying the best abstractions and representations in both perception and policy [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Vision-based Human Motion Modeling and Analysis

NSH 4305

Abstract: Modern computer vision has achieved remarkable success in tasks such as detecting, segmenting, and estimating the pose of humans in images and videos, reaching or even surpassing human-level performance. However, these methods still face significant challenges in predicting and analyzing future human motion. This thesis explores how vision-based solutions can enhance the fidelity and accuracy [...]

VASC Seminar
Bailey Miller
PhD Candidate
Carnegie Mellon University

Stochastic Graphics Primitives

3305 Newell-Simon Hall

Abstract: For decades computer graphics has successfully leveraged stochasticity to enable both expressive volumetric representations of participating media like clouds and efficient Monte Carlo rendering of large scale, complex scenes. In this talk, we’ll explore how these complementary forms of stochasticity (representational and algorithmic) may be applied more generally across computer graphics and vision. In [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Recent Progress in Graph-Search Methods for Multi-Robot-Arm Motion Planning

NSH 4305

Abstract: An exciting frontier in robotic manipulation is the use of multiple arms at once. However, planning concurrent motions is a challenging task using current methods. A major obstacle is the high-dimensional state space of this planning problem, which renders many traditional motion planning algorithms impractical. This opens the door for alternatives to the common [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Physical Process-Informed Mapping for Robotic Exploration

NSH 4305

Abstract: Mobile robots used for information gathering tasks rely on dense, predictive mapping of large-scale regions to determine where to take measurements. Current approaches to mapping commonly rely on Gaussian process regression to spatially correlate data, extrapolate from sparse samples, and estimate uncertainty. However, these approaches do not incorporate meaningful information about physical processes that [...]

Faculty Events

RI Faculty Business Meeting

Newell-Simon Hall 4305

Meeting for RI Faculty. Agenda was sent via a calendar invite.

RI Seminar
Robert Katzschmann
Assistant Professor
Institute for Robotics and Intelligent Systems, ETH Zurich

Can Robots Based on Musculoskeletal Designs Better Interact With the World?

1403 Tepper School Building

Abstract: Living robots represent a new frontier in engineering materials for robotic systems, incorporating biological living cells and synthetic materials into their design. These bio-hybrid robots are dynamic and intelligent, potentially harnessing living matter’s capabilities, such as growth, regeneration, morphing, biodegradation, and environmental adaptation. Such attributes position bio-hybrid devices as a transformative force in robotics [...]

RI Seminar
Allison Okamura
Richard W. Weiland Professor of Engineering
Department of Mechanical Engineering, Stanford University

Soft Wearable Haptic Devices for Ubiquitous Communication

1403 Tepper School Building

Abstract: Haptic devices allow touch-based information transfer between humans and intelligent systems, enabling communication in a salient but private manner that frees other sensory channels. For such devices to become ubiquitous, their physical and computational aspects must be intuitive and unobtrusive. The amount of information that can be transmitted through touch is limited in large [...]

VASC Seminar
Noah Snavely
Professor & Research Scientist
Cornell Tech & Google DeepMind

Reconstructing Everything

3305 Newell-Simon Hall

Abstract: The presentation will be about a long-running, perhaps quixotic effort to reconstruct all of the world's structures in 3D from Internet photos, why this is challenging, and why this effort might be useful in the era of generative AI. Bio: Noah Snavely is a Professor in the Computer Science Department at Cornell University [...]

Field Robotics Center Seminar
Srdjan Acimovic
Assistant Professor
School of Plant and Environmental Sciences, Virginia Tech

Using Robotics, Imaging and AI to Tackle Apple Fruit Production: Crop Harvest and Fire Blight Disease, The Two Major Bottlenecks for U.S. Apple Producers

CIC Building Conference Room 1, LL Level

Abstract: Temperate tree fruit production is a significant agricultural sector in the United States, encompassing a variety of fruits like apples, pears, cherries, peaches and plums. The U.S. is the second-largest producer of apples in the world, after China. Annual U.S. production is 10-11 billion pounds of apples. However, apple production is complicated [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Moving Lights and Cameras for Better 3D Perception of Indoor Scenes

GHC 6501

Abstract: Decades of research on computer vision have highlighted the importance of active sensing -- where an agent controls the parameters of the sensors to improve perception. Research on active perception in the context of robotic manipulation has demonstrated many novel and robust sensing strategies involving a multitude of sensors like RGB and RGBD cameras [...]

RI Seminar
Assistant Professor
Robotics Institute,
Carnegie Mellon University

Building Generalist Robots with Agility via Learning and Control: Humanoids and Beyond

1403 Tepper School Building

Abstract: Recent breathtaking advances in AI and robotics have brought us closer to building general-purpose robots in the real world, e.g., humanoids capable of performing a wide range of human tasks in complex environments. Two key challenges in realizing such general-purpose robots are: (1) achieving "breadth" in task/environment diversity, i.e., the generalist aspect, and (2) [...]

VASC Seminar
Christian Richardt
Research Scientist Lead
Meta Reality Labs Research

High-Fidelity Neural Radiance Fields

3305 Newell-Simon Hall

Abstract: I will present three recent projects that focus on high-fidelity neural radiance fields for walkable VR spaces: VR-NeRF (SIGGRAPH Asia 2023) is an end-to-end system for the high-fidelity capture, model reconstruction, and real-time rendering of walkable spaces in virtual reality using neural radiance fields. To this end, we designed and built a custom multi-camera rig to [...]

VASC Seminar
Saining Xie
Assistant Professor
Courant Institute of Mathematical Sciences, New York University

Building Scalable Visual Intelligence: From Representation to Understanding and Generation

3305 Newell-Simon Hall

Abstract: In this talk, we will dive into our recent work on vision-centric generative AI, focusing on how it helps with understanding and creating visual content like images and videos. We'll cover the latest advances, including multimodal large language models for visual understanding and diffusion transformers for visual generation. We'll explore how these two areas [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Learning to create 3D content

NSH 4305

Abstract: With the popularity of Virtual Reality (VR), Augmented Reality (AR), and other 3D applications, developing methods that let everyday users capture and create their own 3D content has become increasingly essential. Current 3D creation pipelines often require either tedious manual effort or specialized setups with densely captured views. Additionally, many resulting 3D models are [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Trustworthy Learning using Uncertain Interpretation of Data

GHC 6501

Abstract: Motivated by the potential of Artificial Intelligence (AI) in high-cost and safety-critical applications, and recently also by the increasing presence of AI in our everyday lives, Trustworthy AI has grown in prominence as a broad area of research encompassing topics such as interpretability, robustness, verifiable safety, fairness, privacy, accountability, and more. This has created [...]

RI Seminar
Anirudha Majumdar
Associate Professor
Mechanical and Aerospace Engineering, Princeton University

Robots That Know When They Don’t Know

1403 Tepper School Building

Abstract: Foundation models from machine learning have enabled rapid advances in perception, planning, and natural language understanding for robots. However, current systems lack any rigorous assurances when required to generalize to novel scenarios. For example, perception systems can fail to identify or localize unfamiliar objects, and large language model (LLM)-based planners can hallucinate outputs that [...]

VASC Seminar
Qitao Zhao
Master's Student
Computer Vision, Carnegie Mellon University

Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis

3305 Newell-Simon Hall

Abstract: This talk will present our approach for reconstructing objects from sparse-view images captured in unconstrained environments. In the absence of ground-truth camera poses, we will demonstrate how to utilize estimates from off-the-shelf systems and address two key challenges: refining noisy camera poses in sparse views and effectively handling outlier poses. Bio: Qitao is a second-year [...]

VASC Seminar
Vimal Mollyn
PhD Student
Human Computer Interaction Institute, Carnegie Mellon University

EgoTouch: On-Body Touch Input Using AR/VR Headset Cameras

3305 Newell-Simon Hall

Abstract:  In augmented and virtual reality (AR/VR) experiences, a user’s arms and hands can provide a convenient and tactile surface for touch input. Prior work has shown on-body input to have significant speed, accuracy, and ergonomic benefits over in-air interfaces, which are common today. In this work, we demonstrate high accuracy, bare hands (i.e., no special [...]

VASC Seminar
Hyunsung Cho
Ph.D. Student
Human-Computer Interaction Institute (HCII), Carnegie Mellon University

Auptimize: Optimal Placement of Spatial Audio Cues for Extended Reality

3305 Newell-Simon Hall

Abstract:  Spatial audio in Extended Reality (XR) provides users with better awareness of where virtual elements are placed, and efficiently guides them to events such as notifications, system alerts from different windows, or approaching avatars. Humans, however, are inaccurate in localizing sound cues, especially with multiple sources due to limitations in human auditory perception such as [...]

MSR Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

VoxDet: Voxel Learning for Novel Instance Detection

NSH 3305

Abstract: Detecting unseen instances based on multi-view templates is a challenging problem due to its open-world nature. Traditional methodologies, which primarily rely on 2D representations and matching techniques, are often inadequate in handling pose variations and occlusions. To solve this, we introduce VoxDet, a pioneering 3D geometry-aware framework that fully utilizes the strong 3D voxel [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Sensorimotor-Aligned Design for Pareto-Efficient Haptic Immersion in Extended Reality

GHC 4405

Abstract: A new category of computing devices is emerging: augmented and virtual reality headsets, collectively referred to as extended reality (XR). These devices can alter, augment, or even replace our reality. While these headsets have made impressive strides in audio-visual immersion over the past half-century, XR interactions remain almost completely absent of appropriately expressive tactile [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Evaluating and Improving Vision-Language Models Beyond Scaling Laws

GHC 6501

Abstract: In this talk, we present our work on advancing Vision-Language Models (VLMs) beyond scaling laws through improved evaluation and (post-)training strategies. Our contributions include VQAScore, a state-of-the-art alignment metric for text-to-visual generation. We show how VQAScore improves visual generation under real-world user prompts in GenAI-Bench. Additionally, we explore training methods that leverage the language [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Whisker-Inspired Sensors for Unstructured Environments

NSH 4305

Abstract: Robots lack the perception abilities of animals, which is one reason they cannot achieve complex control in outdoor unstructured environments with the same ease as animals. One cause of the perception gap is the constraints researchers place on the environments in which they test new sensors so algorithms can correctly interpret data from [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Strategy and Skill Learning for Physics-based Table Tennis Animation

Abstract: Recent advancements in physics-based character animation leverage deep learning to generate agile and natural motion, enabling characters to execute movements such as backflips, boxing, and tennis. However, reproducing the selection and use of diverse motor skills in dynamic environments to solve complex tasks, as humans do, still remains a challenge. We present a strategy [...]

RI Seminar
Nils Napp
Assistant Professor
Electrical and Computer Engineering, Cornell University

Abstraction Barriers for Embodied Algorithms

1403 Tepper School Building

Abstract: Designing robotic systems to reliably modify their environment typically requires expert engineers and several design iterations. This talk will cover abstraction barriers that can be used to make the process of building such systems easier and the results more predictable. By focusing on approximate mathematical representations that model the process dynamics, these representations can [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Getting Optimization layers to play well with Deep Networks: Numerical methods and Architectures

NSH 4305

Abstract: Many real-world challenges, from robotic control to resource management, can be effectively formulated as optimization problems. Recent advancements have focused on incorporating these optimization problems as layers within deep learning pipelines, enabling the explicit inclusion of auxiliary constraints or cost functions, which is crucial for applications such as enforcing physical laws, ensuring safety constraints, [...]

Faculty Events

RI Faculty Business Meeting

Newell-Simon Hall 4305

Meeting for RI Faculty. Agenda was sent via a calendar invite.

RI Seminar
Axel Krieger
Associate Professor
Department of Mechanical Engineering, Johns Hopkins Whiting School of Engineering

Autonomous Robotic Surgery: Science Fiction or Reality?

1403 Tepper School Building

Abstract:  Robotic assisted surgery (RAS) systems incorporate highly dexterous tools, hand tremor filtering, and motion scaling to enable a minimally invasive surgical approach, reducing collateral damage and patient recovery times. However, current state-of-the-art telerobotic surgery requires a surgeon operating every motion of the robot, resulting in long procedure times and inconsistent results. The advantages of [...]

VASC Seminar
Srinath Sridhar
Assistant Professor
Computer Science, Brown University

Generative Modelling for 3D Multimodal Understanding of Human Physical Interactions

3305 Newell-Simon Hall

Abstract: Generative modelling has been extremely successful in synthesizing text, images, and videos. Can the same machinery also help us better understand how to physically interact with the multimodal 3D world? In this talk, I will introduce some of my group's work in answering this question. I will first discuss how we can enable 2D [...]

Field Robotics Center Seminar
Senior Field Robotics Specialist
Robotics Institute,
Carnegie Mellon University

A retrospective, 40 Years of Field Robotics

CIC Building Conference Room 1, LL Level

Abstract: Chuck has been building and deploying robots in the field for the past 40 years.  In this retrospective he will touch on the robots, people and experiences that have been part of the journey.  From the early days in the 1980s with the Three Mile Island nuclear robots and the first outdoor autonomy robots [...]

MSR Thesis Defense
MSR Student / Teaching Assistant
Robotics Institute,
Carnegie Mellon University

Efficient Quadruped Mobility: Harnessing a Generalist Policy for Streamlined Planning

GHC 4405

Abstract: Navigating quadruped robots through complex, unstructured environments over long horizons remains a significant challenge in robotics. Traditional planning methods offer guarantees such as optimality and long-horizon reasoning, while learning-based methods, particularly those involving deep reinforcement learning (DRL), provide robustness and generalization. In this thesis, we present S3D-OWNS (Skilled 3D-Optimal Waypoint Navigation System), a novel [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Data Attribution for Text-to-Image Models

NSH 4305

Abstract: Large text-to-image models learn from training data to synthesize "novel" images, but how the models use the training data remains a mystery. The problem of data attribution is to identify which training images are influential for generating a given output. Specifically, removing influential images and retraining the model would prevent it from reproducing that [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Differentiable Convex Modeling for Robotic Planning and Control

NSH 4305

Abstract: Robotic simulation, planning, estimation, and control have all been built on top of numerical optimization. At the same time, modern convex optimization has matured into a robust technology delivering globally optimal solutions in polynomial time. With advances in differentiable optimization and custom solvers capable of producing smooth derivatives, convex modeling has become fast, reliable, [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Knowledge and Data Dependence in Decision-Making

NSH 3001

Abstract: This thesis explores diverse decision-making strategies for autonomous agents by examining knowledge-dependent and data-dependent approaches in stationary and dynamic data environments. We address five core research problems across three thematic areas: knowledge-dependent, stationary data-dependent, and evolving data-dependent decision-making. We first investigate knowledge-driven decision-making within robotic swarms, characterizing vulnerabilities in systems governed by consistent rule-following [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Communication Efficient and Differentially Private Optimization

NSH 4305

Abstract: In recent years, the integration of communication efficiency and differential privacy in distributed optimization has gained significant attention, motivated by large-scale applications such as Federated Learning (FL), where both data privacy and efficient communication are critical. This thesis explores the development of novel techniques to address these challenges, with a focus on distributed mean [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Towards a Universal Data Engine for Robotics and Beyond

GHC 4405

Abstract: Robotics researchers have been attempting to extend data-driven breakthroughs in fields like computer vision and language processing into robot learning. However, unlike vision or language domains where massive amounts of data are readily available on the internet, training robotic policies relies on physical and interactive data collected via interacting with the physical world -- [...]

RI Seminar
Assistant Professor
Robotics Institute,
Carnegie Mellon University

Learning for Dynamic Robot Manipulation of Deformable and Transparent Objects

1403 Tepper School Building

Abstract: Dynamics, softness, deformability, and difficult-to-detect objects will be critical for new domains in robotic manipulation. But there are complications--including unmodelled dynamic effects, infinite-dimensional state spaces of deformable objects, and missing features from perception. This talk explores learning methods based on multi-view sensing, acoustics, physics-based regularizations, and Koopman operators and proposes a novel multi-finger soft [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

HaptiClay: An Interactive Haptic Interface for Gestured Concretization of Polynomial Functions

NSH 4305

Abstract: In this work we present HaptiClay, a low-cost kinesthetic haptic interface that elevates the understanding of mathematics language by providing embodied non-verbal representations of math concepts. Our interface integrates four key components: a haptic device, a high-level simulation that communicates with a low-level controller for force and position updates, a low-level controller that executes [...]