Semantic video segmentation using both appearance and geometric information
Abstract
Segmentation is the first step and a core technology for semantic understanding of video. Many computer vision tasks, such as tracking, recognition, and 3D reconstruction, rely on segmentation results as preprocessing. However, video segmentation is known to be a complicated and hard problem. Objects in a video change their colors and shapes according to the surrounding illumination, the camera position, and the object motion. Much prior research has used color, motion, or depth individually as the key cue for segmentation. However, every object in an image is characterized by several features, such as color, texture, depth, and motion, which is why single-feature segmentation methods often fail. Humans can segment objects in video with ease because the human visual system considers color, texture, depth, and motion at the same time. In this paper, we propose a video segmentation algorithm motivated by the human visual system. The algorithm performs video segmentation by simultaneously utilizing the color histogram for color, the optical flow for motion, and the homography for structure. Our results show that the proposed algorithm outperforms other appearance-based segmentation methods in terms of the semantic quality of the segmentation [15]. The proposed segmentation method will serve as a basis for better high-level tasks such as recognition, tracking [3], [4], and video understanding [1].
BibTeX
@article{Woo-2015-109768,
author = {Jihwan Woo and Kris Kitani and Sehoon Kim and Hantak Kwak and Woosung Shim},
title = {Semantic video segmentation using both appearance and geometric information},
journal = {SPIE Intelligent Robots and Computer Vision XXXII: Algorithms and Techniques},
year = {2015},
month = {February},
volume = {9406},
}