PhD Thesis Proposal
Data Attribution for Text-to-Image Models
Abstract: Large text-to-image models learn from training data to synthesize "novel" images, but how the models use the training data remains a mystery. The problem of data attribution is to identify which training images are influential for generating a given output. Specifically, removing influential images and retraining the model would prevent it from reproducing that [...]
Knowledge and Data Dependence in Decision-Making
Abstract: This thesis explores diverse decision-making strategies for autonomous agents by examining knowledge-dependent and data-dependent approaches in stationary and dynamic data environments. We address five core research problems across three thematic areas: knowledge-dependent, stationary data-dependent, and evolving data-dependent decision-making. We first investigate knowledge-driven decision-making within robotic swarms, characterizing vulnerabilities in systems governed by consistent rule-following [...]
Communication Efficient and Differentially Private Optimization
Abstract: In recent years, the integration of communication efficiency and differential privacy in distributed optimization has gained significant attention, motivated by large-scale applications such as Federated Learning (FL), where both data privacy and efficient communication are critical. This thesis explores the development of novel techniques to address these challenges, with a focus on distributed mean [...]
Better Standards for Trajectory Forecasting: Data, Evaluation, and Methods
Abstract: Ensuring pedestrian safety is a key challenge for autonomous systems, particularly in dynamic, multi-agent environments. Trajectory forecasting plays a central role in enabling these systems to anticipate pedestrian behaviors and respond appropriately. This thesis addresses three core limitations that impede safe and robust trajectory forecasting: inadequate evaluation protocols [...]
Bridging Generative and Discriminative Learning with Diffusion Models
Abstract: Generative models have advanced significantly, synthesizing photorealistic images, videos, and text. Building on this progress, our work explores the potential of diffusion models to bridge generative and discriminative learning, uncovering new pathways for leveraging their strengths in visual perception tasks. In the first part, we propose Diff-2-in-1, a unified framework for multi-modal data generation [...]
Bring Hand to The Air: Towards Universal Aerial Manipulation
Abstract: Uncrewed Aerial Vehicles (UAVs) have attracted the interest of researchers, industry, and the general public in many applications. Because high-altitude tasks sometimes require active interaction with the environment, a growing body of recent work focuses on aerial manipulation. Each of these works has demonstrated the ability to use a specific aerial manipulator [...]
Spatial Reasoning and Semantic Representations for Intelligent Multi-Robot Exploration and Navigation
Abstract: Autonomous robot exploration is widely applied in areas such as search and rescue, environmental monitoring, and structural inspection. Multi-robot exploration has garnered significant attention in the robotics research community, as it enables faster task completion and greater coverage than a single robot can achieve. However, it presents unique challenges: behavior coordination is complex, communication [...]
Leveraging Sense of Agency to Improve the Experience of Control Over Assistive Robots
Abstract: In an age of autonomous driving and robotics, we are increasingly engaging with robots that deploy autonomous assistance. Cognitive science and human-computer interaction literature tells us that, when we apply autonomy in assistive settings, we are often augmenting the user's sense of agency over the system. Sense of agency is a phenomenon from cognitive [...]
Efficient Synthetic Data Generation and Utilization for Action Recognition and Universal Avatar Generation
Abstract: Human-centered computer vision technology relies heavily on large, diverse datasets, but collecting data from human subjects is time-consuming, labor-intensive, and raises privacy concerns. To address these challenges, researchers are increasingly using synthetic data to augment real-world datasets. This thesis explores efficient methods for generating and utilizing synthetic data to train human-based computer vision models. [...]