Windowed Bundle Adjustment Framework for Unsupervised Learning of Monocular Depth Estimation With U-Net Extension and Clip Loss

Lipu Zhou and Michael Kaess
Journal Article, IEEE Robotics and Automation Letters, Vol. 5, No. 2, pp. 3283-3290, April 2020

Abstract

This letter presents a self-supervised framework for learning depth from monocular videos. The main contributions of this letter are: (1) We present a windowed bundle adjustment framework to train the network. Compared to most previous works, which only consider constraints between consecutive frames, our framework increases the camera baseline and introduces more constraints to avoid overfitting. (2) We extend the widely used U-Net architecture with a Spatial Pyramid Net (SPN) and a Super Resolution Net (SRN). The SPN fuses information from an image spatial pyramid for depth estimation, which addresses the context-information attenuation problem of the original U-Net. The SRN learns to estimate a high-resolution depth map from a low-resolution image, which benefits the recovery of details. (3) We adopt a clip loss function to handle moving objects and occlusions, which previous works addressed by designing complicated networks or requiring extra information (such as segmentation masks [1]). Experimental results show that our algorithm achieves state-of-the-art results on the KITTI benchmark.
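The abstract does not give the exact form of the clip loss, but the idea of limiting the influence of outlier pixels (moving objects, occlusions) on the photometric error can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the quantile-based threshold, and the parameter `q` are all assumptions for the sake of the example.

```python
import numpy as np

def clipped_photometric_loss(err, q=0.9):
    """Hypothetical clip loss sketch (not the paper's exact formulation).

    Per-pixel photometric errors above the q-th quantile are clipped to
    that quantile, so a few large residuals from moving objects or
    occluded regions cannot dominate the training signal.
    """
    err = np.asarray(err, dtype=np.float64)
    thresh = np.quantile(err, q)          # clip threshold from the error distribution
    return np.minimum(err, thresh).mean()  # mean of clipped per-pixel errors

# A single large outlier (e.g. a moving car) is capped rather than
# dominating the loss:
errors = np.array([0.1, 0.1, 0.1, 10.0])
print(clipped_photometric_loss(errors, q=0.75) < errors.mean())
```

The appeal of this style of loss is that it needs no extra supervision: unlike approaches that mask dynamic regions with segmentation labels, the clipping acts as a built-in robust estimator on the photometric residuals.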

BibTeX

@article{Zhou-2020-125352,
author = {Lipu Zhou and Michael Kaess},
title = {Windowed Bundle Adjustment Framework for Unsupervised Learning of Monocular Depth Estimation With U-Net Extension and Clip Loss},
journal = {IEEE Robotics and Automation Letters},
year = {2020},
month = {April},
volume = {5},
number = {2},
pages = {3283--3290},
}