TartanVO: A Generalizable Learning-based VO

Wang W, Hu Y, and Scherer S
Conference Paper, Proceedings of the 2020 Conference on Robot Learning, Vol. 155, pp. 1761-1772, 2021

Abstract

We present the first learning-based visual odometry (VO) model that generalizes to multiple datasets and real-world scenarios and outperforms geometry-based methods in challenging scenes. We achieve this by leveraging the SLAM dataset TartanAir, which provides a large amount of diverse synthetic data in challenging environments. Furthermore, to make our VO model generalize across datasets, we propose an up-to-scale loss function and incorporate the camera intrinsic parameters into the model. Experiments show that a single model, TartanVO, trained only on synthetic data and without any finetuning, generalizes to real-world datasets such as KITTI and EuRoC, demonstrating significant advantages over geometry-based methods on challenging trajectories. Our code is available at https://github.com/castacks/tartanvo.
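The up-to-scale loss mentioned above penalizes only the direction of the predicted translation, since monocular VO cannot recover absolute scale. A minimal sketch of one common formulation, normalizing both translations to unit length before comparing them, is below; the function name and the epsilon guard are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def up_to_scale_loss(t_pred, t_gt, eps=1e-6):
    """Distance between unit-normalized translation vectors.

    Hypothetical re-implementation of an up-to-scale loss: dividing
    each translation by its norm removes the (unobservable) absolute
    scale, so only the direction error is penalized. `eps` guards
    against division by zero for near-stationary frames.
    """
    t_pred_n = t_pred / max(np.linalg.norm(t_pred), eps)
    t_gt_n = t_gt / max(np.linalg.norm(t_gt), eps)
    return np.linalg.norm(t_pred_n - t_gt_n)
```

Under this formulation, a prediction that points the right way but has the wrong magnitude incurs zero loss, which is exactly the invariance a monocular model needs during cross-dataset training.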

BibTeX

@conference{Wang-2021-139824,
author = {Wang, Wenshan and Hu, Yaoyu and Scherer, Sebastian},
title = {TartanVO: A Generalizable Learning-based VO},
booktitle = {Proceedings of the 2020 Conference on Robot Learning},
year = {2021},
month = {January},
editor = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
volume = {155},
series = {Proceedings of Machine Learning Research},
pages = {1761-1772},
publisher = {PMLR},
}