I2D-Loc: Camera localization via image to LiDAR depth flow
Abstract
Accurate camera localization in existing LiDAR maps is promising because it can combine the strengths of LiDAR-based and camera-based methods. However, methods that robustly handle the appearance and modality differences involved in 2D–3D localization are still missing. To address this problem, we propose I2D-Loc, a scene-agnostic, end-to-end trainable neural network that estimates the 6-DoF pose of an RGB image with respect to an existing LiDAR map by locally optimizing an initial pose. Specifically, we first project the LiDAR map onto the image plane according to a rough initial pose and apply a depth completion algorithm to generate a dense depth image. We further design a confidence map to weight the features extracted from the dense depth image, yielding a more reliable depth representation. We then use a neural network to estimate the correspondence flow between the depth and RGB images. Finally, we use the BPnP algorithm to estimate the 6-DoF pose, back-propagating the gradients of the pose error to optimize the front-end network parameters. Moreover, by decoupling the camera intrinsics from the end-to-end training process, I2D-Loc generalizes to images with different intrinsic parameters. Experiments on the KITTI, Argoverse, and Lyft5 datasets demonstrate that I2D-Loc achieves centimeter-level localization performance. The source code, dataset, trained models, and demo videos are released at https://levenberg.github.io/I2D-Loc/.
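The first stage of the pipeline, projecting the LiDAR map into the image plane under a rough initial pose, can be sketched as below. This is a minimal illustration assuming a standard pinhole camera model; the function name, the pose convention (a 4×4 LiDAR-to-camera transform), and the simple z-buffer rasterization are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def project_lidar_to_depth(points, T_cam_lidar, K, image_size):
    """Project LiDAR points into the image plane to form a sparse depth map.

    points: (N, 3) LiDAR points in the map frame (assumed layout).
    T_cam_lidar: 4x4 rough initial pose mapping LiDAR to camera coordinates.
    K: 3x3 pinhole intrinsics. image_size: (H, W).
    """
    H, W = image_size
    # Transform points into the camera frame with the (rough) initial pose.
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    # Keep only points in front of the camera.
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]
    # Perspective projection with the intrinsics.
    uvw = (K @ pts_cam.T).T
    z = uvw[:, 2]
    u = np.round(uvw[:, 0] / z).astype(int)
    v = np.round(uvw[:, 1] / z).astype(int)
    # Rasterize, keeping the nearest depth per pixel (simple z-buffer):
    # draw far points first so nearer points overwrite them.
    depth = np.zeros((H, W), dtype=np.float32)
    order = np.argsort(-z)
    for ui, vi, zi in zip(u[order], v[order], z[order]):
        if 0 <= ui < W and 0 <= vi < H:
            depth[vi, ui] = zi
    return depth
```

The resulting sparse depth map would then be densified by a depth completion algorithm before flow estimation, as described in the abstract.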
BibTeX
@article{Chen-2022-139785,
author = {Chen, K. and Yu, H. and Yang, W. and Yu, L. and Scherer, S. and Xia, G.-S.},
title = {I2D-Loc: Camera localization via image to LiDAR depth flow},
journal = {ISPRS Journal of Photogrammetry and Remote Sensing},
year = {2022},
month = {January},
volume = {194},
pages = {209--221},
keywords = {Camera localization, 2D–3D registration, Flow estimation, Depth completion, Neural network},
}