Carnegie Mellon University
12:30 pm to 1:30 pm
Newell-Simon Hall 3305
Abstract:
One of the most fundamental capabilities required of any robotics application is the ability to adequately assimilate and respond to incoming sensor data. In the case of 3D range sensing, modern sensors generate massive quantities of point cloud data that strain available computational resources. Dealing with large quantities of unevenly sampled 3D point data poses a significant challenge across many fields, including autonomous driving, 3D manipulation, augmented reality, and medical imaging. This thesis explores how carefully designed statistical models for point cloud data can facilitate, accelerate, and unify many common tasks in range-based 3D perception. We first establish a novel family of compact generative models for 3D point cloud data, offering them as an efficient and robust statistical alternative to traditional point-based or voxel-based data structures. We then show how these statistical models can be used to build a unified data processing architecture for tasks such as segmentation, registration, visualization, and mapping.
In complex robotics systems, it is common for concurrent perceptual processes to maintain separate low-level data processing pipelines. Besides introducing redundancy, these pipelines may process the same data in conflicting or ad hoc ways. Avoiding this requires tractable data structures and models that share common perceptual processing elements. Additionally, since many robotics applications involving point cloud processing are size-, weight-, and power-constrained, these models and their associated algorithms should be deployable on low-power embedded systems while retaining acceptable performance. Given a sufficiently flexible and robust point processor, many low-level tasks could therefore be unified under a common architectural paradigm, greatly simplifying the overall perceptual system.
In this thesis, a family of compact generative models for point cloud data, based on hierarchical Gaussian Mixture Models, is introduced. Using recursive, data-parallel variants of the Expectation-Maximization algorithm, we construct high-fidelity, hierarchical statistical models that compactly represent the data as a 3D generative probability distribution. In contrast to raw points or voxel-based decompositions, our proposed statistical model provides a sounder theoretical footing for robustly handling noise, constructing maximum-likelihood methods, reasoning probabilistically about free space, applying spatial sampling techniques, and performing gradient-based optimization. Further, constructing the model as a spatial hierarchy allows octree-like logarithmic-time access. One challenge relative to previous methods, however, is that our model-based approach incurs a potentially high creation cost. To mitigate this, we exploit data parallelism and design the models to be well suited to GPU acceleration, allowing them to run at rate in many time-critical applications. We show how our models facilitate various 3D perception tasks, demonstrating state-of-the-art performance in geometric segmentation, registration, dynamic occupancy map creation, and 3D visualization.
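To make the construction concrete, the following is a minimal sketch, in NumPy rather than a GPU implementation, of how a hierarchical GMM might be built by recursive EM: fit a small mixture to the points, partition the points by their most responsible component, and recurse within each partition. The class name HGMMNode, the 8-way branching factor, and the stopping thresholds are illustrative assumptions, not details from the thesis.

import numpy as np

class HGMMNode:
    """One Gaussian in the hierarchy; an empty children list marks a leaf."""
    def __init__(self, mean, cov, weight):
        self.mean = mean          # (3,) Gaussian mean
        self.cov = cov            # (3, 3) covariance
        self.weight = weight      # mixture weight within the parent
        self.children = []        # sub-mixture refining this component

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density for each row of an (N, 3) array."""
    d = x - mean
    inv = np.linalg.inv(cov)
    norm = np.sqrt((2 * np.pi) ** 3 * np.linalg.det(cov))
    return np.exp(-0.5 * np.einsum('ni,ij,nj->n', d, inv, d)) / norm

def fit_gmm(points, k, iters=20, seed=0):
    """Plain EM for a k-component GMM over an (N, 3) point array."""
    rng = np.random.default_rng(seed)
    n = len(points)
    means = points[rng.choice(n, k, replace=False)]
    covs = np.stack([np.cov(points.T) + 1e-6 * np.eye(3)] * k)
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = np.stack([weights[j] * gaussian_pdf(points, means[j], covs[j])
                         for j in range(k)], axis=1)
        resp /= resp.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate parameters from the soft assignments.
        nk = resp.sum(axis=0) + 1e-12
        means = (resp.T @ points) / nk[:, None]
        for j in range(k):
            d = points - means[j]
            covs[j] = (resp[:, j, None] * d).T @ d / nk[j] + 1e-6 * np.eye(3)
        weights = nk / n
    return means, covs, weights, resp

def build_hgmm(points, branching=8, depth=0, max_depth=3, min_points=64):
    """Fit a small GMM, then recursively refine each component's points."""
    means, covs, weights, resp = fit_gmm(points, branching)
    hard = resp.argmax(axis=1)  # hard-assign points for the recursion
    nodes = []
    for j in range(branching):
        node = HGMMNode(means[j], covs[j], weights[j])
        subset = points[hard == j]
        if depth + 1 < max_depth and len(subset) > min_points:
            node.children = build_hgmm(subset, branching, depth + 1,
                                       max_depth, min_points)
        nodes.append(node)
    return nodes

Because each level partitions the data among a fixed number of components, a query such as finding the most likely component for a point can greedily descend the hierarchy and touch only a logarithmic number of nodes, much like an octree traversal; the per-level EM iterations are also embarrassingly parallel across points, which is what makes the GPU formulation attractive.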
Thesis Committee Members:
Alonzo Kelly, Chair
Martial Hebert
Srinivasa Narasimhan
Jan Kautz, NVIDIA