Unsupervised Learning of Human Activity Grammars from Noisy Input Sequences - Robotics Institute Carnegie Mellon University


Kris M. Kitani, Yoichi Sato, and Akihiro Sugimoto
Journal Article, IPSJ Transactions on Computer Vision and Image Media, Vol. 1, No. 2, pp. 86-99, July 2008

Abstract

Context-free grammars have been shown to be useful for applications beyond natural language analysis, in particular vision-based human activity analysis. However, symbol strings produced from video differ from natural language strings in that they can contain noise symbols, which makes grammatical inference very difficult. To obtain reliable results from grammatical inference, these noise symbols must be identified. We propose a new technique for identifying the subset of terminal symbols that yields the best activity grammar. Our approach uses the Minimum Description Length (MDL) principle to evaluate the trade-off between model complexity and data fit, quantifying the differences between the results obtained from each terminal subset. The evaluation results are then used to identify a class of candidate terminal subsets and grammars that remove the noise and reveal the basic structure of an activity. We demonstrate the validity of the proposed method through experiments with both artificial and real data.
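To give a feel for the MDL-based selection idea described above, here is a minimal toy sketch in Python. It is not the paper's method (which scores grammars induced from each terminal subset); instead it uses a simple stand-in code length: a fixed per-symbol model cost, an entropy-coded data cost for the kept symbols, and a small flagging cost for each occurrence dropped as noise. All bit costs (8.0, 5.0) and the example sequence are arbitrary assumptions chosen for illustration.

```python
import math
from itertools import combinations

def mdl_score(sequence, subset):
    """Toy MDL score for keeping only the terminals in `subset`.

    Model cost grows with the number of kept symbols; data cost is the
    Shannon code length of the filtered sequence under its empirical
    distribution (a stand-in for grammar-based coding); dropped symbols
    are charged a small per-occurrence flagging cost.
    """
    filtered = [s for s in sequence if s in subset]
    if not filtered:
        return float("inf")
    # Model cost: an assumed fixed budget of 8 bits per kept symbol.
    model_bits = 8.0 * len(subset)
    # Data cost: entropy coding of the filtered string.
    n = len(filtered)
    counts = {s: filtered.count(s) for s in subset}
    data_bits = -sum(c * math.log2(c / n) for c in counts.values() if c)
    # Noise cost: an assumed 5 bits per occurrence discarded as noise.
    noise_bits = 5.0 * (len(sequence) - n)
    return model_bits + data_bits + noise_bits

def best_subset(sequence):
    """Exhaustively search all non-empty terminal subsets for the MDL minimum."""
    symbols = sorted(set(sequence))
    candidates = [set(c) for r in range(1, len(symbols) + 1)
                  for c in combinations(symbols, r)]
    return min(candidates, key=lambda s: mdl_score(sequence, s))

# A periodic activity pattern "abc" corrupted by sporadic noise symbols x, y.
seq = list("abcabcxabcyabcabcxabc")
print(sorted(best_subset(seq)))  # → ['a', 'b', 'c']
```

With these costs, keeping {a, b, c} is cheaper than either modeling the rare noise symbols or discarding real structure, which is the trade-off the MDL criterion is meant to capture.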

BibTeX

@article{Kitani-2008-109772,
author = {Kris M. Kitani and Yoichi Sato and Akihiro Sugimoto},
title = {Unsupervised Learning of Human Activity Grammars from Noisy Input Sequences},
journal = {IPSJ Transactions on Computer Vision and Image Media},
year = {2008},
month = {July},
volume = {1},
number = {2},
pages = {86--99},
}