Ensemble Knowledge Transfer for Semantic Segmentation
Abstract
Semantic segmentation networks are usually learned in a strictly supervised manner, i.e., they are trained and tested on similar data distributions. Performance drops drastically in the presence of domain shifts. In this paper, we explore methods for learning across train and test distributions that dramatically differ in scene structure, viewpoints, and object statistics. Motivated by the proliferation of aerial drone robotics, we consider the target task of semantic segmentation from aerial viewpoints. Inspired by the impact of Cityscapes [11], we introduce AeroScapes, a new dataset of 3269 images of aerial scenes (captured with a fleet of drones) annotated with dense semantic segmentations. Our dataset differs from existing segmentation datasets (which focus on ground-view or indoor-scene domains) in terms of viewpoint, scene composition, and object scales. We propose a simple but effective approach for transferring knowledge from such diverse domains (for which considerable annotated training data exists) to our target task. To do so, we train multiple models for aerial segmentation via progressive fine-tuning through each source domain. We then treat this collection of models as an ensemble that can be aggregated to significantly improve performance. We demonstrate large absolute improvements (8.12%) over widely-used standard baselines.
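The abstract's ensemble aggregation step can be sketched as follows. This is an illustrative sketch only: it assumes each fine-tuned model emits a per-pixel softmax probability map and that aggregation is done by averaging those maps, which may differ from the paper's exact aggregation rule.

```python
import numpy as np

def ensemble_segment(prob_maps):
    """Aggregate per-pixel class probabilities from several models.

    prob_maps: list of arrays of shape (H, W, C), one per model, each
    holding softmax class probabilities. Returns the per-pixel argmax
    over the averaged probabilities as an (H, W) label map.
    (Hypothetical helper for illustration; not the paper's code.)
    """
    avg = np.mean(np.stack(prob_maps, axis=0), axis=0)  # (H, W, C)
    return np.argmax(avg, axis=-1)                      # (H, W)

# Toy example: two "models" disagree on the second pixel of a 1x2
# image with 2 classes; averaging their confidences resolves the tie.
m1 = np.array([[[0.9, 0.1], [0.4, 0.6]]])  # shape (1, 2, 2)
m2 = np.array([[[0.8, 0.2], [0.7, 0.3]]])
labels = ensemble_segment([m1, m2])        # -> [[0, 0]]
```

Averaging probabilities (rather than majority-voting hard labels) lets a confidently correct model outvote several uncertain ones at a given pixel.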
BibTeX
@conference{Nigam-2018-121171,
  author = {I. Nigam and C. Huang and D. Ramanan},
  title = {Ensemble Knowledge Transfer for Semantic Segmentation},
  booktitle = {Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV '18)},
  year = {2018},
  month = {March},
  pages = {1499-1508},
}