Contrastive Learning for Unpaired Image-to-Image Translation

Taesung Park, Alexei A. Efros, Richard Zhang, and Jun-Yan Zhu
Conference Paper, Proceedings of the European Conference on Computer Vision (ECCV), pp. 319-345, August 2020

Abstract

In image-to-image translation, each patch in the output should reflect the content of the corresponding patch in the input, independent of domain. We propose a straightforward method for doing so: maximizing mutual information between the two, using a framework based on contrastive learning. The method encourages two elements (corresponding patches) to map to a similar point in a learned feature space, relative to other elements (other patches) in the dataset, referred to as negatives. We explore several critical design choices for making contrastive learning effective in the image synthesis setting. Notably, we use a multilayer, patch-based approach, rather than operating on entire images. Furthermore, we draw negatives from within the input image itself, rather than from the rest of the dataset. We demonstrate that our framework enables one-sided translation in the unpaired image-to-image translation setting, while improving quality and reducing training time. In addition, our method can even be extended to the training setting where each “domain” is only a single image.
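
To make the patch-level contrastive objective described in the abstract concrete, here is a minimal sketch of an InfoNCE-style loss for a single output patch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`normalize`, `patch_nce_loss`), the use of plain Python lists as feature vectors, and the default temperature are all hypothetical choices. The query stands for the feature of an output patch, the positive for the feature of the corresponding input patch, and the negatives for features of other patches from the same input image.

```python
import math


def normalize(v):
    """Scale a non-zero feature vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def patch_nce_loss(query, positive, negatives, temperature=0.07):
    """InfoNCE-style loss for one patch (illustrative sketch).

    query:     feature of an output patch
    positive:  feature of the corresponding input patch
    negatives: features of other patches from the same input image
    temperature: assumed illustrative value that scales the similarities
    """
    q = normalize(query)
    # The positive goes first, so the "correct class" is index 0.
    keys = [normalize(positive)] + [normalize(n) for n in negatives]
    # Cosine similarity between the query and every key, scaled by temperature.
    logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
    # Cross-entropy with target index 0, using a stable log-sum-exp.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]
```

With this sketch, a query that lies close to its positive and far from the negatives yields a loss near zero, while a query that is closer to one of the negatives yields a large loss. A multilayer variant, as the abstract describes, would sum such terms over patches sampled from several encoder layers.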

Notes
We thank Allan Jabri and Phillip Isola for helpful discussion and feedback. Taesung Park is supported by a Samsung Scholarship and an Adobe Research Fellowship, and some of this work was done as an Adobe Research intern. This work was partially supported by NSF grant IIS-1633310, a grant from SAP, and gifts from Berkeley DeepDrive and Adobe.

BibTeX

@conference{Park-2020-125668,
author = {Taesung Park and Alexei A. Efros and Richard Zhang and Jun-Yan Zhu},
title = {Contrastive Learning for Unpaired Image-to-Image Translation},
booktitle = {Proceedings of (ECCV) European Conference on Computer Vision},
year = {2020},
month = {August},
pages = {319 - 345},
keywords = {Contrastive learning, Noise contrastive estimation, Mutual information, Image generation},
}