
PhD Speaking Qualifier

Emily Kim
PhD Student, Robotics Institute, Carnegie Mellon University
Friday, March 24
9:00 am to 10:00 am
NSH 3305
A Multi-view Synthetic and Real-world Human Activity Recognition Dataset

Abstract:
Advancements in Human Activity Recognition (HAR) partially rely on the creation of datasets that cover a broad range of activities under various conditions. Unfortunately, obtaining and labeling datasets containing human activity is complex, laborious, and costly. One way to mitigate these difficulties with sufficient generality to provide robust activity recognition on unseen data is to replace or supplement real-world data with synthetic data. In this paper, we present a new activity recognition dataset for eleven activity classes with both ground-level and aerial camera views in several outdoor environments. In addition to annotated real-world data, we provide several synthetic versions of the dataset constructed with two rendering methods (traditional computer graphics or a learned image-based network) animated with motion capture data. We trained a number of HAR models on our dataset and compared the performance of the real data and the various forms of synthetic data. Our results indicate that synthetic data alone can provide performance similar to real data but cannot outperform it. Furthermore, a model pretrained on synthetic data and fine-tuned on limited amounts of real data can surpass the individual performance of the real and synthetic domains. Computer graphics (CG)-based rendering yields higher classification accuracy than the image-based network data generation method. Lastly, a model trained on CG-generated aerial-view synthetic data is more robust to camera viewpoint changes than the model trained on the real data.
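
To make the two-stage recipe in the abstract concrete, below is a minimal sketch of pretraining on synthetic clips and then fine-tuning on a limited amount of real clips. The model choice (torchvision's R3D-18), the dummy loaders, the epoch counts, and the learning rates are all illustrative assumptions, not the authors' actual setup.

```python
# Hypothetical sketch: pretrain a video classifier on synthetic data,
# then fine-tune on limited real data at a lower learning rate.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models.video import r3d_18

NUM_CLASSES = 11  # eleven activity classes, per the abstract

def dummy_loader(n_clips):
    # Stand-in for the synthetic/real video datasets (assumption):
    # random clips shaped (batch, channels, frames, height, width).
    clips = torch.randn(n_clips, 3, 8, 112, 112)
    labels = torch.randint(0, NUM_CLASSES, (n_clips,))
    return DataLoader(TensorDataset(clips, labels), batch_size=4)

synthetic_loader = dummy_loader(16)  # abundant synthetic data
real_loader = dummy_loader(8)        # limited real data

model = r3d_18()  # 3D-CNN video backbone; head replaced for 11 classes
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
criterion = nn.CrossEntropyLoss()

def run_epoch(loader, optimizer):
    model.train()
    for clips, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(clips), labels)
        loss.backward()
        optimizer.step()

# Stage 1: pretrain on synthetic clips.
pretrain_opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
for _ in range(10):
    run_epoch(synthetic_loader, pretrain_opt)

# Stage 2: fine-tune on real clips with a smaller learning rate,
# so pretrained features are adapted rather than overwritten.
finetune_opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
for _ in range(3):
    run_epoch(real_loader, finetune_opt)
```

Keeping the fine-tuning learning rate lower than the pretraining rate is a common way to preserve what was learned from the larger synthetic set while adapting to the real domain.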

Committee:
Jessica Hodgins (Chair)
Fernando De la Torre
Jun-Yan Zhu
Cherie Ho