Visual Motif Discovery via First-Person Vision
Abstract
Visual motifs are images of visual experiences that are significant and shared across many people, such as an image of an informative sign viewed by many people or an image of a familiar social situation such as interacting with a clerk at a store. The goal of this study is to discover visual motifs from a collection of first-person videos recorded by a wearable camera. To achieve this goal, we develop a commonality clustering method that leverages three important aspects: inter-video similarity, intra-video sparseness, and people’s visual attention. The problem is posed as normalized spectral clustering, and is solved efficiently using a weighted covariance matrix. Experimental results suggest the effectiveness of our method over several state-of-the-art methods in terms of both accuracy and efficiency of visual motif discovery.
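The abstract does not spell out how the weighted covariance matrix enters the computation. The following is only a minimal sketch of one common way such a shortcut works, under assumptions that are not taken from the paper: the affinity between samples is a plain linear kernel W = X Xᵀ on nonnegative features, and the function name `spectral_embedding_via_weighted_covariance` is hypothetical. It omits the inter-video, intra-video, and visual-attention terms entirely. Under those assumptions, the leading eigenvectors of the normalized affinity can be recovered from the small d × d matrix Xᵀ D⁻¹ X instead of the n × n affinity.

```python
import numpy as np

def spectral_embedding_via_weighted_covariance(X, k):
    """Return the top-k eigenvectors and eigenvalues of D^{-1/2} X X^T D^{-1/2}.

    Assumptions (not from the paper): X is an (n, d) array of nonnegative
    features with no all-zero row, and k does not exceed the rank of X.
    """
    X = np.asarray(X, dtype=float)
    # Row sums of W = X X^T, computed without forming the n x n matrix.
    degrees = X @ X.sum(axis=0)
    # Scale each sample by the inverse square root of its degree.
    Z = X / np.sqrt(degrees)[:, None]
    # d x d weighted covariance matrix X^T D^{-1} X.
    C = Z.T @ Z
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1][:k]
    top_vals = eigvals[order]
    top_vecs = eigvecs[:, order]
    # Map eigenvectors of Z^T Z to unit-norm eigenvectors of Z Z^T.
    U = Z @ top_vecs / np.sqrt(top_vals)
    return U, top_vals

rng = np.random.default_rng(0)
X = rng.random((50, 5))
U, vals = spectral_embedding_via_weighted_covariance(X, 3)
```

The rows of `U` could then be passed to a standard clustering step such as k-means. The cost of the eigendecomposition depends on the feature dimension d rather than on the number of samples n, which is the kind of saving the abstract's efficiency claim suggests.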
BibTeX
@conference{Yonetani-2016-109800,
author = {Ryo Yonetani and Kris M. Kitani and Yoichi Sato},
title = {Visual Motif Discovery via First-Person Vision},
booktitle = {Proceedings of (ECCV) European Conference on Computer Vision},
year = {2016},
month = {October},
pages = {187 - 203},
}