Abstract:
Autonomous navigation in human crowds (i.e., social navigation) presents several challenges: the robot often needs to rely on its noisy sensors to identify and localize pedestrians in crowds; it needs to plan efficient paths to reach its goals; and it needs to do so in a safe and socially appropriate manner. Recent work has proposed model-based methods that emphasize modeling specific interaction scenarios, as well as learning-based methods that tackle the navigation problem end-to-end. Model-based methods adapt poorly to complex crowded environments, while learning-based methods lack access to large, complex real-world datasets and can often only be trained in unrealistic simulators.
In this thesis, we address the social navigation problem from the novel angle of leveraging pedestrian groups. We first introduce the concept of social group space via group split and merge predictions and formulate a model for group state prediction. We further show that split and merge predictions made on group-based representations are more accurate than those made on individual-based representations. Second, we integrate our group-based representations and prediction models into a Model Predictive Control (MPC) framework. We show that, compared to individual-based representations in the same MPC framework, our framework produces safer and more social motions. This demonstrates the benefit of model-based methods when coupled with a learning-based state predictor. Third, we propose a simplified representation of the social group space based on the visible edges of the groups. We show that this simplified representation can replace our original representation in the MPC framework, maintaining similar performance while significantly reducing computation time.
In parallel with these contributions, we address the need for large-scale, real-world pedestrian datasets for training learning-based social navigation methods. We also identify a similar need to capture a greater variety of group-based pedestrian interactions. In response, we introduce our own scalable data collection effort and dataset: the TBD Pedestrian Dataset. Our data collection pipeline enables efficient collection and labeling of large quantities of data. The publicly available dataset contains both top-down and ego-centric view sensor data and is substantially larger than comparable prior datasets, providing a foundation for future work on social robot navigation.
Thesis Committee Members:
Aaron Steinfeld, Chair
Jean Oh
Katia Sycara
Takayuki Kanda, Kyoto University