Recovering Surface Layout from an Image - Robotics Institute Carnegie Mellon University

Recovering Surface Layout from an Image

Derek Hoiem, Alexei A. Efros, and Martial Hebert
Journal Article, International Journal of Computer Vision: Special Issue on Celebrating Kanade's Vision, Vol. 75, No. 1, pp. 151 - 172, October, 2007

Abstract

Humans have an amazing ability to instantly grasp the overall 3D structure of a scene (ground orientation, relative positions of major landmarks, etc.) even from a single image. This ability is completely missing in most popular recognition algorithms, which pretend that the world is flat and/or view it through a patch-sized peephole. Yet it seems very likely that having a grasp of this "surface layout" of a scene should be of great assistance for many tasks, including recognition, navigation, and novel view synthesis. In this paper, we take the first step towards constructing the surface layout, a labeling of the image into geometric classes. Our main insight is to learn appearance-based models of these geometric classes, which coarsely describe the 3D scene orientation of each image region. Our multiple segmentation framework provides robust spatial support, allowing a wide variety of cues (e.g., color, texture, and perspective) to contribute to the confidence in each geometric label. In experiments on a large set of outdoor images, we evaluate the impact of the individual cues and design choices in our algorithm. We further demonstrate the applicability of our method to indoor images, describe potential applications, and discuss extensions to a more complete notion of surface layout.
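The abstract does not spell out how the multiple segmentations are combined, so the following is only a minimal sketch of one plausible reading: each segmentation assigns every pixel to a segment, each segment carries a confidence per geometric label, and a pixel's confidence is the normalized sum over all segmentations. The function name `combine_segmentations`, the label names, and the unweighted sum are assumptions made for illustration, not the paper's stated procedure.

```python
from typing import Dict, List, Sequence

# Illustrative label names (an assumption; the abstract only says "geometric classes").
LABELS = ("support", "vertical", "sky")


def combine_segmentations(
    segmentations: Sequence[Sequence[int]],
    segment_scores: Sequence[Dict[int, Dict[str, float]]],
    labels: Sequence[str] = LABELS,
) -> List[Dict[str, float]]:
    """Pool per-segment label confidences into per-pixel confidences.

    segmentations[k][p]        -> id of the segment containing pixel p in segmentation k
    segment_scores[k][s][name] -> confidence that segment s (in segmentation k) has label name

    Returns one dict per pixel whose values sum to 1.
    """
    num_pixels = len(segmentations[0])
    per_pixel: List[Dict[str, float]] = []
    for p in range(num_pixels):
        totals = {name: 0.0 for name in labels}
        # Every segmentation votes with the scores of the segment covering pixel p.
        for seg_map, scores in zip(segmentations, segment_scores):
            seg_id = seg_map[p]
            for name in labels:
                totals[name] += scores[seg_id].get(name, 0.0)
        norm = sum(totals.values())
        if norm > 0:
            per_pixel.append({name: value / norm for name, value in totals.items()})
        else:
            # No evidence at all: fall back to a uniform distribution.
            per_pixel.append({name: 1.0 / len(labels) for name in labels})
    return per_pixel


if __name__ == "__main__":
    # Two toy segmentations of a 4-pixel "image" (hypothetical numbers).
    segs = [[0, 0, 1, 1], [0, 1, 1, 1]]
    scores = [
        {0: {"support": 0.8, "vertical": 0.2, "sky": 0.0},
         1: {"support": 0.1, "vertical": 0.3, "sky": 0.6}},
        {0: {"support": 0.6, "vertical": 0.4, "sky": 0.0},
         1: {"support": 0.2, "vertical": 0.6, "sky": 0.2}},
    ]
    for pixel, conf in enumerate(combine_segmentations(segs, scores)):
        print(pixel, conf)
```

Pooling over several segmentations means that no single, possibly wrong, grouping of pixels decides the label on its own, which is one way to read the "robust spatial support" mentioned in the abstract.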

BibTeX

@article{Hoiem-2007-9825,
author = {Derek Hoiem and Alexei A. Efros and Martial Hebert},
title = {Recovering Surface Layout from an Image},
journal = {International Journal of Computer Vision: Special Issue on Celebrating Kanade's Vision},
year = {2007},
month = {October},
volume = {75},
number = {1},
pages = {151--172},
keywords = {surface layout, spatial layout, geometric context, scene understanding, context, object detection, model-driven segmentation, image understanding, multiple segmentations, object recognition},
}