Customizing Large-scale Text-to-Image Models - Robotics Institute Carnegie Mellon University

PhD Speaking Qualifier

Nupur Kumari, PhD Student, Robotics Institute, Carnegie Mellon University
Tuesday, November 28
2:00 pm to 3:00 pm
NSH 4305
Customizing Large-scale Text-to-Image Models

Abstract:
Advancements in large-scale generative models represent a watershed moment. These models can generate a wide variety of objects and scenes with different styles and compositions. However, they are trained on a fixed snapshot of available data, which often contains copyrighted or private images. This leaves them lacking in two respects: (a) as end users, we often wish to synthesize specific concepts from our personal lives, which these models cannot generate with sufficient fidelity, and (b) these models can generate unsafe and copyrighted images, e.g., the work of contemporary artists and memorized training set images. We propose to tackle these issues by providing the flexibility to (a) add new user-defined concepts to the model and (b) remove existing concepts from the model. For both tasks, we focus on efficiently fine-tuning the model with the training objective corresponding to each task so as to minimally affect text-to-image synthesis on unrelated concepts.

Committee:
Prof. Jun-Yan Zhu (Chair)
Prof. Deva Ramanan
Prof. Shubham Tulsiani
Jason Zhang