Few-shot Learning for Segmentation
Abstract
Most learning architectures for semantic segmentation require large amounts of densely annotated data, since every pixel must be assigned to a class. Few-shot segmentation aims to replace this large volume of training data with only a few densely annotated samples. In this paper, we propose a two-branch network, FuseNet, that segments an input image (the query image) given one or more images of the target class (the support images). FuseNet preserves the local context around the target by masking out the non-target region in feature space. The network then uses the cosine similarity between the masked support features and the query features as guidance for predicting the segmentation mask. In the few-shot case, we weight the guidance from each support image according to its image-level feature similarity with the query. We also quantify the effect of the number of support images on Intersection over Union (IoU). Our network achieves state-of-the-art results on PASCAL VOC 2012 for both one-shot and five-shot semantic segmentation.
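The sketch below illustrates the two core operations the abstract describes: masked average pooling of support features and a cosine-similarity guidance map for the query, fused across shots by image-level similarity. It is a minimal PyTorch sketch, not the thesis implementation; the function names (masked_average_pool, guidance_map) and the softmax normalization of the shot weights are assumptions for illustration.

import torch
import torch.nn.functional as F

def masked_average_pool(feat, mask):
    """Average support features over the target region only.

    feat: (C, H, W) support feature map; mask: (H, W) binary target mask.
    Masking out non-target pixels keeps only the target's local context.
    """
    mask = mask.unsqueeze(0).float()                      # (1, H, W)
    return (feat * mask).sum(dim=(1, 2)) / mask.sum().clamp(min=1.0)  # (C,)

def guidance_map(query_feat, support_feats, support_masks):
    """Cosine-similarity guidance for the query, fused over k shots.

    query_feat: (C, H, W); support_feats: list of (C, H, W) maps;
    support_masks: list of (H, W) masks. Each shot's guidance is weighted
    by its image-level similarity to the query (an assumed instantiation
    of the weighting scheme described in the abstract).
    """
    C, H, W = query_feat.shape
    q_flat = query_feat.view(C, -1)                       # (C, H*W)
    q_global = query_feat.mean(dim=(1, 2))                # image-level query feature

    maps, weights = [], []
    for feat, mask in zip(support_feats, support_masks):
        proto = masked_average_pool(feat, mask)           # (C,) masked prototype
        # Per-pixel cosine similarity between the prototype and query features.
        sim = F.cosine_similarity(proto.unsqueeze(1), q_flat, dim=0)
        maps.append(sim.view(H, W))
        # Image-level similarity used to weight this shot's guidance.
        weights.append(F.cosine_similarity(q_global, feat.mean(dim=(1, 2)), dim=0))

    weights = torch.softmax(torch.stack(weights), dim=0)  # normalize over shots
    return sum(w * m for w, m in zip(weights, maps))      # (H, W) fused guidance

In the one-shot case the weighting reduces to a single map; with five shots, the softmax lets more query-like support images contribute more to the fused guidance.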
BibTeX
@mastersthesis{Dai-2019-116358,
  author   = {Chia Dai},
  title    = {Few-shot Learning for Segmentation},
  year     = {2019},
  month    = {July},
  school   = {Carnegie Mellon University},
  address  = {Pittsburgh, PA},
  number   = {CMU-RI-TR-19-35},
  keywords = {Learning, Semantic Segmentation, One-Shot, Few-Shot, Representation Learning},
}