3D Recognition with self-supervised learning and generic architectures

Abstract: Supervised learning relies on manual labeling which scales poorly with the number of tasks and data. Manual labeling is especially cumbersome for 3D recognition tasks such as detection and segmentation and thus most 3D datasets are surprisingly small compared to image or video datasets. 3D recognition methods are also fragmented based on the type [...]