PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

3D Video Models through Point Tracking, Reconstructing and Forecasting

NSH 3305

Abstract: 3D scene understanding from 2D video is essential for enabling advanced applications such as autonomous driving, robotics, virtual reality, and augmented reality. These fields rely on accurate 3D spatial awareness and dynamic interaction modeling to navigate complex environments, manipulate objects, and provide immersive experiences. 3D training data is far less abundant than 2D data, which [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Towards a Robot Generalist through In-Context Learning and Abstractions

NSH 1305

Abstract: The goal of this thesis is to discover AI processes that enhance cross-domain and cross-task generalization in intelligent robot agents. Unlike the dominant approach in contemporary robot learning, which pursues generalization primarily through scaling laws (increasing data and model size), we focus on identifying the best abstractions and representations in both perception and policy [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Vision-based Human Motion Modeling and Analysis

NSH 4305

Abstract: Modern computer vision has achieved remarkable success in tasks such as detecting, segmenting, and estimating the pose of humans in images and videos, reaching or even surpassing human-level performance. However, it still faces significant challenges in predicting and analyzing future human motion. This thesis explores how vision-based solutions can enhance the fidelity and accuracy [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Recent Progress in Graph-Search Methods for Multi-Robot-Arm Motion Planning

NSH 4305

Abstract: An exciting frontier in robotic manipulation is the use of multiple arms at once. However, planning concurrent motions is a challenging task using current methods. A major obstacle is the high-dimensional state space of this planning problem, which renders many traditional motion planning algorithms impractical. This opens the door for alternatives to the common [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Physical Process-Informed Mapping for Robotic Exploration

NSH 4305

Abstract: Mobile robots used for information gathering tasks rely on dense, predictive mapping of large-scale regions to determine where to take measurements. Current approaches to mapping commonly rely on Gaussian process regression to spatially correlate data, extrapolate from sparse samples, and estimate uncertainty. However, these approaches do not incorporate meaningful information about physical processes that [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Moving Lights and Cameras for Better 3D Perception of Indoor Scenes

GHC 6501

Abstract: Decades of research on computer vision have highlighted the importance of active sensing -- where an agent controls the parameters of the sensors to improve perception. Research on active perception in the context of robotic manipulation has demonstrated many novel and robust sensing strategies involving a multitude of sensors like RGB and RGBD cameras [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Learning to create 3D content

NSH 4305

Abstract: With the popularity of Virtual Reality (VR), Augmented Reality (AR), and other 3D applications, developing methods that let everyday users capture and create their own 3D content has become increasingly essential. Current 3D creation pipelines often require either tedious manual effort or specialized setups with densely captured views. Additionally, many resulting 3D models are [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Trustworthy Learning using Uncertain Interpretation of Data

GHC 6501

Abstract: Motivated by the potential of Artificial Intelligence (AI) in high-cost and safety-critical applications, and recently also by the increasing presence of AI in our everyday lives, Trustworthy AI has grown in prominence as a broad area of research encompassing topics such as interpretability, robustness, verifiable safety, fairness, privacy, accountability, and more. This has created [...]

MSR Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

VoxDet: Voxel Learning for Novel Instance Detection

NSH 3305

Abstract: Detecting unseen instances based on multi-view templates is a challenging problem due to its open-world nature. Traditional methodologies, which primarily rely on 2D representations and matching techniques, are often inadequate in handling pose variations and occlusions. To solve this, we introduce VoxDet, a pioneering 3D geometry-aware framework that fully utilizes the strong 3D voxel [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Sensorimotor-Aligned Design for Pareto-Efficient Haptic Immersion in Extended Reality

GHC 4405

Abstract: A new category of computing devices is emerging: augmented and virtual reality headsets, collectively referred to as extended reality (XR). These devices can alter, augment, or even replace our reality. While these headsets have made impressive strides in audio-visual immersion over the past half-century, XR interactions remain almost completely devoid of appropriately expressive tactile [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Evaluating and Improving Vision-Language Models Beyond Scaling Laws

GHC 6501

Abstract: In this talk, we present our work on advancing Vision-Language Models (VLMs) beyond scaling laws through improved evaluation and (post-)training strategies. Our contributions include VQAScore, a state-of-the-art alignment metric for text-to-visual generation. We show how VQAScore improves visual generation under real-world user prompts in GenAI-Bench. Additionally, we explore training methods that leverage the language [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Whisker-Inspired Sensors for Unstructured Environments

NSH 4305

Abstract: Robots lack the perception abilities of animals, which is one reason they cannot achieve complex control in outdoor unstructured environments with the same ease as animals. One cause of the perception gap is the constraints researchers place on the environments in which they test new sensors so algorithms can correctly interpret data from [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

Strategy and Skill Learning for Physics-based Table Tennis Animation

Abstract: Recent advancements in physics-based character animation leverage deep learning to generate agile and natural motion, enabling characters to execute movements such as backflips, boxing, and tennis. However, reproducing the selection and use of diverse motor skills in dynamic environments to solve complex tasks, as humans do, remains a challenge. We present a strategy [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Getting Optimization layers to play well with Deep Networks: Numerical methods and Architectures

NSH 4305

Abstract: Many real-world challenges, from robotic control to resource management, can be effectively formulated as optimization problems. Recent advancements have focused on incorporating these optimization problems as layers within deep learning pipelines, enabling the explicit inclusion of auxiliary constraints or cost functions, which is crucial for applications such as enforcing physical laws, ensuring safety constraints, [...]

MSR Thesis Defense
MSR Student / Teaching Assistant
Robotics Institute,
Carnegie Mellon University

Efficient Quadruped Mobility: Harnessing a Generalist Policy for Streamlined Planning

GHC 4405

Abstract: Navigating quadruped robots through complex, unstructured environments over long horizons remains a significant challenge in robotics. Traditional planning methods offer guarantees such as optimality and long-horizon reasoning, while learning-based methods, particularly those involving deep reinforcement learning (DRL), provide robustness and generalization. In this thesis, we present S3D-OWNS (Skilled 3D-Optimal Waypoint Navigation System), a novel [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Data Attribution for Text-to-Image Models

NSH 4305

Abstract: Large text-to-image models learn from training data to synthesize "novel" images, but how the models use the training data remains a mystery. The problem of data attribution is to identify which training images are influential for generating a given output. Specifically, removing influential images and retraining the model would prevent it from reproducing that [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Differentiable Convex Modeling for Robotic Planning and Control

NSH 4305

Abstract: Robotic simulation, planning, estimation, and control have all been built on top of numerical optimization. At the same time, modern convex optimization has matured into a robust technology delivering globally optimal solutions in polynomial time. With advances in differentiable optimization and custom solvers capable of producing smooth derivatives, convex modeling has become fast, reliable, [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Knowledge and Data Dependence in Decision-Making

NSH 3001

Abstract: This thesis explores diverse decision-making strategies for autonomous agents by examining knowledge-dependent and data-dependent approaches in stationary and dynamic data environments. We address five core research problems across three thematic areas: knowledge-dependent, stationary data-dependent, and evolving data-dependent decision-making. We first investigate knowledge-driven decision-making within robotic swarms, characterizing vulnerabilities in systems governed by consistent rule-following [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Communication Efficient and Differentially Private Optimization

NSH 4305

Abstract: In recent years, the integration of communication efficiency and differential privacy in distributed optimization has gained significant attention, motivated by large-scale applications such as Federated Learning (FL), where both data privacy and efficient communication are critical. This thesis explores the development of novel techniques to address these challenges, with a focus on distributed mean [...]

PhD Thesis Defense
PhD Student
Robotics Institute,
Carnegie Mellon University

Towards a Universal Data Engine for Robotics and Beyond

GHC 4405

Abstract: Robotics researchers have been attempting to extend data-driven breakthroughs in fields like computer vision and language processing into robot learning. However, unlike vision or language domains where massive amounts of data are readily available on the internet, training robotic policies relies on physical and interactive data collected by interacting with the physical world -- [...]

PhD Speaking Qualifier
PhD Student
Robotics Institute,
Carnegie Mellon University

HaptiClay: An Interactive Haptic Interface for Gestured Concretization of Polynomial Functions

NSH 4305

Abstract: In this work we present HaptiClay, a low-cost kinesthetic haptic interface that elevates the understanding of mathematics language by providing embodied non-verbal representations of math concepts. Our interface integrates four key components: a haptic device, a high-level simulation that communicates with a low-level controller for force and position updates, a low-level controller that executes [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Better Standards for Trajectory Forecasting: Data, Evaluation, and Methods

GHC 8102

Abstract: Ensuring pedestrian safety is a key challenge for autonomous systems, particularly in dynamic, multi-agent environments. Trajectory forecasting plays a central role in enabling these systems to anticipate pedestrian behaviors and respond appropriately. This thesis addresses three core limitations in trajectory forecasting systems that impede safe and robust forecasting: inadequate evaluation protocols [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Bridging Generative and Discriminative Learning with Diffusion Models

GHC 4405

Abstract: Generative models have advanced significantly, synthesizing photorealistic images, videos, and text. Building on this progress, our work explores the potential of diffusion models to bridge generative and discriminative learning, uncovering new pathways for leveraging their strengths in visual perception tasks. In the first part, we propose Diff-2-in-1, a unified framework for multi-modal data generation [...]

PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Bring Hand to The Air: Towards Universal Aerial Manipulation

NSH 4305

Abstract: Uncrewed Aerial Vehicles (UAVs) have attracted the interest of researchers, industry, and the general public in many applications. Because high-altitude tasks sometimes require active interaction with the environment, a growing body of recent work focuses on aerial manipulation. Each of them has demonstrated the ability to use a specific aerial manipulator [...]