PhD Thesis Proposal
PhD Student
Robotics Institute,
Carnegie Mellon University

Efficient Synthetic Data Generation and Utilization for Action Recognition and Universal Avatar Generation

NSH 3305

Abstract: Human-centered computer vision technology relies heavily on large, diverse datasets, but collecting data from human subjects is time-consuming, labor-intensive, and raises privacy concerns. To address these challenges, researchers are increasingly using synthetic data to augment real-world datasets. This thesis explores efficient methods for generating and utilizing synthetic data to train human-based computer vision models. [...]

MSR Thesis Defense
MSR Student
Robotics Institute,
Carnegie Mellon University

Multi-Resolution Informative Path Planning for Small Teams of Robots

GHC 4405

Abstract: Unmanned aerial vehicles can increase the efficiency of information-gathering applications. A key challenge is balancing the search across multiple locations of varying importance while determining the best sensing altitude, given each agent's finite operation time. In this work, we present a multi-resolution informative path planning approach for small teams of unmanned aerial [...]

PhD Thesis Defense
Postdoctoral Fellow
Robotics Institute,
Carnegie Mellon University

Communication-Efficient Active Reconstruction using Self-Organizing Gaussian Mixture Models

GHC 4405

Abstract: For the multi-robot active reconstruction task, this thesis proposes using Gaussian mixture models (GMMs) as the map representation that enables multiple downstream tasks: high-fidelity static scene reconstruction, communication-efficient map sharing, and safe informative planning. A new method called Self-Organizing Gaussian mixture modeling (SOGMM) is proposed that estimates the model complexity (i.e., number of Gaussian [...]

MSR Thesis Defense
MSR Student
Robotics Institute,
Carnegie Mellon University

Vision-Language Models for Hand-Object Interaction Prediction

Rashid Auditorium - 4401 Gates and Hillman Centers

Abstract: How can we predict future interaction trajectories of human hands in a scene given high-level colloquial task specifications in the form of natural language? In this paper, we extend the classic hand trajectory prediction task to two tasks involving explicit or implicit language queries. Our proposed tasks require extensive understanding of human daily activities [...]