TartanVO: A Generalizable Learning-based VO

Wang W, Hu Y, and Scherer S
Conference Paper, Proceedings of the 2020 Conference on Robot Learning, Vol. 155, pp. 1761-1772, 2021

Abstract

We present the first learning-based visual odometry (VO) model that generalizes to multiple datasets and real-world scenarios and outperforms geometry-based methods in challenging scenes. We achieve this by leveraging the SLAM dataset TartanAir, which provides a large amount of diverse synthetic data in challenging environments. Furthermore, to make our VO model generalize across datasets, we propose an up-to-scale loss function and incorporate the camera intrinsic parameters into the model. Experiments show that a single model, TartanVO, trained only on synthetic data and without any finetuning, generalizes to real-world datasets such as KITTI and EuRoC, demonstrating significant advantages over geometry-based methods on challenging trajectories. Our code is available at https://github.com/castacks/tartanvo.
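The up-to-scale loss mentioned above penalizes only the direction of the predicted translation, since monocular VO cannot recover absolute scale. A minimal sketch of one common formulation, normalizing both translations to unit length before comparing them, is below; the function name and the epsilon guard are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def up_to_scale_loss(t_pred, t_gt, eps=1e-6):
    """Distance between unit-normalized translation vectors.

    Hypothetical re-implementation of an up-to-scale loss: dividing
    each translation by its norm removes the (unobservable) absolute
    scale, so only the direction error is penalized. `eps` guards
    against division by zero for near-stationary frames.
    """
    t_pred_n = t_pred / max(np.linalg.norm(t_pred), eps)
    t_gt_n = t_gt / max(np.linalg.norm(t_gt), eps)
    return np.linalg.norm(t_pred_n - t_gt_n)
```

Under this formulation, a prediction that points the right way but has the wrong magnitude incurs zero loss, which is exactly the invariance a monocular model needs during cross-dataset training.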

BibTeX

@conference{Wang-2021-139824,
author = {Wang, Wenshan and Hu, Yaoyu and Scherer, Sebastian},
title = {TartanVO: A Generalizable Learning-based VO},
booktitle = {Proceedings of the 2020 Conference on Robot Learning},
year = {2021},
month = {January},
editor = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
volume = {155},
series = {Proceedings of Machine Learning Research},
pages = {1761-1772},
publisher = {PMLR},
}