4:00 pm to 12:00 am
Event Location: NSH 1507
Bio: Kris M. Kitani received his B.S. in electrical engineering from the
University of Southern California in 1999. From 2000 to 2003 he worked
on automated visual defect inspection at KLA-Tencor and later earned
his M.S. and Ph.D. in information and communication engineering from the
University of Tokyo in 2005 and 2008, respectively. Dr. Kitani is an
assistant professor at the University of Electro-Communications in Tokyo
and an adjunct researcher at the University of Tokyo. He is currently a
visiting researcher at the University of California at San Diego. His
research topics include syntactic modeling and unsupervised learning for
human activity analysis, pedestrian tracking, and applications of
computer vision to HCI.
Abstract: Unsupervised approaches for learning human activities and human actions
from video can be useful when dealing with large datasets where
sufficient amounts of labeled data are unavailable or impractical to
obtain. A syntactic approach to modeling high-level human activities
will be presented, showing that the essential hierarchical structure of
human activities, in the form of a stochastic context-free grammar, can
be recovered using the minimum description length (MDL) principle.
Experimental results show that compact activity grammars can be learned
from data that has been corrupted by sensor noise. Furthermore, a
probabilistic mixture model that learns primitive action categories
without labeled training data will be presented, along with a
Dirichlet-process-based Bayesian equivalent that is applied to first-person action
category learning. It will be shown how the use of key modalities of
image features can help to improve learning performance.