Object-level visual SLAM for plant modeling
Abstract
Camera-based Simultaneous Localization and Mapping (SLAM) in agricultural field robotics is challenging due to scene dynamics, varying illumination, and the limited texture inherent in outdoor environments. We propose a pipeline that combines recent advances in deep learning with traditional 3D processing techniques to achieve fast and accurate SLAM in vineyards. We use images captured by a stereo camera and their 3D reconstruction to detect objects of interest and divide them into classes: grapes, leaves, and branches. The accuracy of these detections is improved by leveraging information about each object's local 3D neighborhood. Our method builds a dense 3D model of the scene without assuming constant illumination or a static scene, and can be generalized to other crops such as oranges and apples with minor modifications to the pipeline.
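As a rough sketch of the 3D refinement step, the snippet below smooths per-point class labels (grape / leaf / branch) by majority vote over each point's k nearest neighbors in the reconstructed point cloud. The function name, the choice of k, and the use of SciPy's cKDTree are illustrative assumptions, not details taken from the thesis.

import numpy as np
from scipy.spatial import cKDTree

def refine_labels(points, labels, k=15):
    # points: (N, 3) reconstructed 3D coordinates; labels: (N,) integer class ids.
    # k=15 is an arbitrary illustrative neighborhood size.
    tree = cKDTree(points)
    # Query the k nearest neighbors of every point (the first neighbor is the point itself).
    _, idx = tree.query(points, k=k)
    refined = np.empty_like(labels)
    for i, neighbors in enumerate(idx):
        # Re-assign each point to the most frequent class in its local 3D neighborhood.
        refined[i] = np.bincount(labels[neighbors]).argmax()
    return refined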
In addition, we explore two applications of 3D modeling in vineyards: grape counting and mapping of dormant-season grape canes. For the counting task, we achieve an F1 score of 0.977 with respect to the ground-truth grape count. For the cane mapping task, we model canes with a position error below 2 cm, serving as the perception system for a successful cane-pruning robot prototype.
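For context, the reported F1 score is the harmonic mean of detection precision and recall; a minimal sketch computing it from true-positive, false-positive, and false-negative counts follows. The example counts are made up to reproduce a value near 0.977 and are not the thesis's data.

def f1_score(tp, fp, fn):
    # Precision: fraction of detections that are real grapes.
    precision = tp / (tp + fp)
    # Recall: fraction of real grapes that were detected.
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Example with made-up counts:
print(f1_score(tp=430, fp=10, fn=10))  # ~0.977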
BibTeX
@mastersthesis{Nellithimaru-2019-117165,
  author   = {Anjana Kakecochi Nellithimaru},
  title    = {Object-level visual SLAM for plant modeling},
  year     = {2019},
  month    = {August},
  school   = {Carnegie Mellon University},
  address  = {Pittsburgh, PA},
  number   = {CMU-RI-TR-19-27},
  keywords = {Visual SLAM, semantic segmentation, grape counting, vineyard mapping, 3D modeling},
}