Abstract:
Large text-to-image models learn from training data to synthesize “novel” images, but how these models use their training data remains a mystery. The problem of data attribution is to identify which training images are influential for generating a given output. Concretely, if the influential images were removed and the model retrained, it would no longer reproduce that output image. Unfortunately, directly searching for these “ground truth” influential images is computationally infeasible, since it would require repeatedly retraining the model from scratch.
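To make the counterfactual definition concrete, the sketch below spells out the (infeasible) ground-truth test it implies. This is purely illustrative: the `retrain`, `generate`, and `similarity` callables, the `.id` attribute on training examples, and the threshold are hypothetical stand-ins, not part of the proposal.

```python
def is_influential(candidate_ids, train_set, retrain, generate, similarity,
                   query_prompt, query_image, threshold=0.7):
    """Return True if removing `candidate_ids` and retraining prevents the
    model from reproducing `query_image` for `query_prompt`."""
    removed = set(candidate_ids)
    # Drop the candidate images from the training set (assumes each example
    # carries an `.id` field; this is a hypothetical data layout).
    reduced_set = [ex for ex in train_set if ex.id not in removed]
    model = retrain(reduced_set)        # full retraining: the infeasible step
    regenerated = generate(model, query_prompt)
    # The candidates are influential iff the retrained model can no longer
    # match the original output.
    return similarity(regenerated, query_image) < threshold
```

Exhaustively searching over candidate sets with this test would require one full retraining run per candidate set, which is what makes the direct approach intractable.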
My research aims to develop effective and scalable attribution methods and evaluation schemes for large text-to-image models. First, I present a computationally feasible attribution benchmark for large text-to-image models. Using “customization” methods, we define ground-truth attribution labels by creating synthetic images that are influenced, by construction, by known exemplar images. This scheme enables efficient evaluation without repeated retraining. Next, I will present a new data attribution approach for general text-to-image models: we simulate unlearning the synthesized image, identify the training images that are forgotten as a result of the unlearning process, and label those images as influential (see the sketch below).
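As one way to picture the unlearning-based idea, here is a minimal PyTorch-style sketch under stated assumptions: `loss_fn` is a hypothetical per-example diffusion loss, and the step count and learning rate are illustrative choices, not the proposal’s actual algorithm.

```python
import copy
import torch

def attribute_by_unlearning(model, loss_fn, synth_image, synth_prompt,
                            train_examples, steps=10, lr=1e-5):
    """Score training examples by how much they are "forgotten" after
    simulating unlearning of one synthesized image."""
    unlearned = copy.deepcopy(model)
    opt = torch.optim.SGD(unlearned.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Gradient *ascent* on the synthesized image's loss, i.e. a crude
        # simulation of unlearning that output.
        (-loss_fn(unlearned, synth_image, synth_prompt)).backward()
        opt.step()
    scores = []
    with torch.no_grad():
        for img, prompt in train_examples:
            before = loss_fn(model, img, prompt).item()
            after = loss_fn(unlearned, img, prompt).item()
            scores.append(after - before)  # larger = more "forgotten"
    return scores
```

Training images whose loss increases most after the simulated unlearning are ranked as most influential, without ever retraining from scratch.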
Finally, I will present ongoing work on improving the efficiency of attribution algorithms and propose a future research direction for developing interpretable attribution algorithms.
Thesis Committee Members:
Jun-Yan Zhu, Chair
Deva Ramanan
Ruslan Salakhutdinov
Alexei A. Efros, UC Berkeley
David Bau, Northeastern University