Carnegie Mellon University
Title: ARC: AdveRsarial Calibration between Modalities
Abstract:
Advances in computer vision and machine learning have brought great success to RGB-input perception tasks and have opened new possibilities for non-RGB-input perception tasks, such as object detection from wireless signals, point clouds, and infrared light.
However, compared with the mature development pipeline for RGB-input (source-modality) models, developing non-RGB-input (target-modality) models from scratch poses considerable challenges: modality-specific networks and training tricks must be designed, and target-modality data must be collected and annotated.
In this thesis, AdveRsarial Calibration (ARC) is proposed as an efficient pipeline for calibrating target-modality inputs to mature DNN models developed on the source modality. Under ARC, a target-modality-input model is composed simply by adding a small calibrator module ahead of an existing source-modality model. Our ARC training techniques require as little as zero manual annotation on the target modality while producing metrics comparable to or better than those of baseline target models that require 100% manual annotation. We present the ARC components that enable us to achieve these goals: (1) model inversion to synthesize inverted images from the source-modality model, (2) paired {target, source} data with zero manual annotations, (3) Foreground Semantics Reconstruction, (4) Decayed Semantic Supervision, and (5) Skipped Inverted Attention.
We demonstrate the effectiveness of ARC by composing WiFi-input, LiDAR-input, and Thermal-Infrared-input models on top of pre-trained RGB-input models.
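The composition described in the abstract, a small calibrator placed ahead of an unchanged source-modality model, can be illustrated with a minimal Python sketch. Everything here is a hypothetical stand-in rather than the thesis implementation: `frozen_source_model`, `LinearCalibrator`, and `compose` are invented names, the calibrator is a plain linear map, and none of the ARC training components (model inversion, adversarial losses, and so on) are shown.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def frozen_source_model(rgb_like: Sequence[float]) -> float:
    """Hypothetical stand-in for a pre-trained RGB-input model.

    It returns the mean of its input as a toy "score"; a real source
    model would be a DNN whose weights stay fixed.
    """
    return sum(rgb_like) / len(rgb_like)


class LinearCalibrator:
    """Hypothetical small calibrator mapping target-modality features
    to an input the source model accepts. This is an assumption made for
    illustration, not the architecture used in the thesis."""

    def __init__(self, weights: List[List[float]]) -> None:
        # weights has one row per source-input dimension and
        # one column per target-input dimension.
        self.weights = weights

    def __call__(self, target: Sequence[float]) -> Vector:
        return [sum(w * x for w, x in zip(row, target)) for row in self.weights]


def compose(
    calibrator: Callable[[Sequence[float]], Vector],
    source_model: Callable[[Sequence[float]], float],
) -> Callable[[Sequence[float]], float]:
    """Build a target-modality-input model: calibrator first, source model second."""

    def target_model(target_input: Sequence[float]) -> float:
        return source_model(calibrator(target_input))

    return target_model


# Usage: a 2-dimensional target input calibrated to a 3-dimensional source input.
calibrator = LinearCalibrator([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
target_model = compose(calibrator, frozen_source_model)
score = target_model([2.0, 4.0])
```

In this sketch only the calibrator holds adjustable parameters; the source model is called as-is, which mirrors the idea of reusing an existing source-modality model without retraining it.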
Committee:
Prof. Fernando De La Torre (chair)
Dr. Dong Huang (chair)
Prof. Alexander G. Hauptmann
Zhengyi Luo