Learning To See In The Dark and Beyond
Abstract
Robotic perception in diverse domains, such as low-light scenarios, remains a challenge even after the incorporation of new sensing modalities like thermal imaging and specialized night-vision sensors. This is primarily due to the difficulty of obtaining labeled data in these new domains across multiple tasks.
In this thesis, we provide a pathway for designing robots that can operate in new visual domains, across tasks of varying labeling difficulty. While our work is directed towards operating artificial agents passively at night, it can be extended to other new environments as well. We demonstrate our approach on the critically important and representative tasks of object detection and semantic segmentation, where label generation is often feasible for the former but not for the latter.
First, we extend the operating range of an object-detection system to enable on-robot low-light operation. We do so by employing a high-sensitivity camera, training an object-detection model on its imagery with the aid of labeled in-domain data, and deploying the model for on-robot operations, thus extending the operating range of the system to function 24/7.
For the more challenging setting, where generating large quantities of new labels can be prohibitively expensive, we propose a novel, label-efficient, and effective Domain Adaptation framework, Almost Unsupervised Domain Adaptation (AUDA), that critically accounts for biases learned by the original model in the source domain, and we demonstrate it on semantic segmentation.
While existing Domain Adaptation techniques promise to leverage labels from well-lit RGB image datasets, they fail to consider the characteristics of the source domain itself, such as noise patterns, texture, and glare. We account for this holistically by proposing Source Preparation (SP), a method to mitigate source-domain biases. Our semi-supervised framework for realistic robotic scenarios, AUDA, combines SP, Unsupervised Domain Adaptation (UDA), and Supervised Alignment (SA) on limited labeled target data (∼tens of images) to train models in new domains.
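To make the composition of these stages concrete, the following minimal sketch outlines how SP, UDA, and SA might chain together; all function names, signatures, and data structures here are illustrative assumptions, not the implementation from the thesis.

from typing import Callable, Iterable, Tuple

# Illustrative type aliases; the concrete data structures are an assumption.
Images = Iterable   # a dataset of images
Labels = Iterable   # the corresponding annotations
Model = object      # e.g., a semantic-segmentation network

def auda_pipeline(
    source: Tuple[Images, Labels],
    target_unlabeled: Images,
    target_labeled: Tuple[Images, Labels],                # ~tens of images
    prepare_source: Callable[[Images], Images],           # SP stage
    adapt_unsupervised: Callable[[Images, Labels, Images], Model],  # UDA stage
    align_supervised: Callable[[Model, Images, Labels], Model],     # SA stage
) -> Model:
    """Compose the three AUDA stages named in the abstract."""
    src_images, src_labels = source
    # 1. Source Preparation: mitigate source-domain biases
    #    (noise patterns, texture, glare) before any adaptation.
    src_images = prepare_source(src_images)
    # 2. Unsupervised Domain Adaptation: leverage the unlabeled target images.
    model = adapt_unsupervised(src_images, src_labels, target_unlabeled)
    # 3. Supervised Alignment: fine-tune on the small labeled target set.
    tgt_images, tgt_labels = target_labeled
    return align_supervised(model, tgt_images, tgt_labels)

In this framing, SP runs entirely on the source side before adaptation begins, while SA is the only stage that touches labeled target data, which is what keeps the overall label budget to tens of images.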
Our method outperforms the state-of-the-art across a range of visual domains, with improvements of up to ∼40% mIoU in unsupervised and ∼30% mIoU in semi-supervised scenarios, in addition to a marked increase in robustness to realistic shifts in the target domain. Finally, we introduce the first ‘intensified’ dataset captured at night, comprising images from an intensifier camera and a high-sensitivity camera, to facilitate low-light robotic operations.
BibTeX
@mastersthesis{Ramesh-2023-137693,
  author   = {Anirudha Ramesh},
  title    = {Learning To See In The Dark and Beyond},
  year     = {2023},
  month    = {August},
  school   = {Carnegie Mellon University},
  address  = {Pittsburgh, PA},
  number   = {CMU-RI-TR-23-43},
  keywords = {Domain Adaptation, Low-Light Vision, Semantic Segmentation, Object Detection, Unsupervised Learning, Semi-Supervised Learning},
}