Towards Universal Place Recognition - Robotics Institute Carnegie Mellon University

Towards Universal Place Recognition

Master's Thesis, Tech. Report, CMU-RI-TR-24-31, August, 2024

Abstract

Place Recognition is a key component for building a robust and reliable SLAM system that enables robot autonomy and global localization in complex scenarios such as disaster-response and search-and-rescue tasks. Despite recent advances leveraging large-scale training and improved training techniques for visual, LIDAR, and thermal place recognition, current systems remain fragile and are heavily engineered towards specific environments, limiting in-the-wild deployment. At the same time, vision foundation models have recently shown impressive generalized and open-vocabulary behaviour in diverse environments for visual tasks. Building on these key insights, we first demonstrate AnyLoc, a universal solution to Visual Place Recognition (VPR) that works across diverse structured and unstructured environments without any re-training or fine-tuning. Although these foundation models are self-supervised and receive no VPR-specific training, we show that aggregating their features achieves up to 4× higher performance than state-of-the-art VPR systems. Furthermore, these features reveal distinct semantic domains corresponding to datasets from similar environments, helping us further improve performance. We then develop MultiLoc and show that these features can be distilled into other modalities, namely LIDAR and thermal, enabling cross-modal place recognition even in challenging environments. We evaluate our approach by repurposing existing public datasets for visual-LIDAR-thermal place recognition. For the first time, we show that zero-shot cross-modal place recognition between modalities unseen during training can be achieved at test time. The experiments and analysis in this thesis lay a foundation for building VPR solutions that may be deployed anywhere, anytime, across any view, and on any sensor.

BibTeX

@mastersthesis{Karhade-2024-142542,
author = {Jay Karhade},
title = {Towards Universal Place Recognition},
year = {2024},
month = {August},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-24-31},
keywords = {Visual Place Recognition, Multi-Modal Place Recognition, Cross-Modal Place Recognition},
}