Image and Depth Coherent Surface Description
Abstract
Description of a surface involves accurate modeling of its geometrical and textural properties. The choice of a surface description depends on both the observations we obtain from the scene and the level of modeling we seek. It can be a piecewise-linear approximation of surface geometry and 2-D texture or a dense point-based approximation of fine-scale geometry and albedo. In both cases, modeled properties of the unknown surface have to best explain observations of the surface such as images, depth samples, or appearance primitives. Re-projections of the 3-D model should be image coherent in that they best account for what is observed in images. Depth values of the modeled structure, as viewed from sensors, should be coherent with those of the real surface. In many computer vision and graphics applications, a surface is described using a texture-mapped mesh of simplicial elements, such as triangles or tetrahedra. The mesh is a piecewise-linear approximation of surface geometry. Vertex points of the mesh are derived from observed 2-D image features or dense 3-D depth data. However, these vertices do not usually sample the surface at its critical points, such as corners and edges. This results in poor modeling of the surface, both geometrically and texturally. We show that the model can be refined or faired by relocating vertices of the mesh to the desired critical points. Relocating a vertex uses texture of the mesh faces common to the vertex. We introduce the idea of EigenFairing, an image-coherent mesh fairing algorithm that exploits the distance-from-feature-space of textured faces to incrementally refine a mesh to best approximate the modeled surface. EigenFairing couples geometrical properties of a 3-D model with its textural properties. A piecewise-linear approximation of a surface does not suffice for modeling accuracy because most surface patches in natural scenes are curved or have fine-scale geometry, e.g., 3-D texture. 
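The distance-from-feature-space measure that EigenFairing exploits can be illustrated with a minimal sketch. The abstract does not say how the thesis computes it, so the code below only shows the generic definition: the squared residual left after projecting a mean-centered texture vector onto a subspace spanned by orthonormal basis vectors (for example, leading eigenvectors from a PCA of face textures). The function name, the flattened-patch representation, and the toy basis are all assumptions made for illustration.

```python
def distance_from_feature_space(patch, mean, basis):
    """Squared residual of `patch` after projection onto span(basis).

    patch, mean: equal-length sequences of intensities (hypothetical
                 flattened texture of a mesh face).
    basis:       list of orthonormal vectors, assumed to be precomputed
                 (e.g. leading eigenvectors of a texture covariance).
    """
    centered = [p - m for p, m in zip(patch, mean)]
    # Total energy of the mean-centered patch.
    total_energy = sum(c * c for c in centered)
    # Energy captured by the subspace: sum of squared projection coefficients.
    in_space_energy = sum(
        sum(c * b for c, b in zip(centered, vec)) ** 2 for vec in basis
    )
    # Clamp to guard against tiny negative values from floating-point rounding.
    return max(0.0, total_energy - in_space_energy)


# Toy example: a 2-D feature space inside a 3-D texture space.
toy_mean = [0, 0, 0]
toy_basis = [[1, 0, 0], [0, 1, 0]]
inside = distance_from_feature_space([3, 4, 0], toy_mean, toy_basis)
outside = distance_from_feature_space([3, 4, 2], toy_mean, toy_basis)
```

In this toy setting a patch lying in the feature space has zero distance, while a component outside the subspace contributes its squared magnitude. A fairing step could, under these assumptions, prefer the vertex position whose surrounding face textures give the smallest such distance.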
Points on a surface, as observed in multiple images, are related to each other via the epipolar constraint. Pixel intensities along epipolar lines for neighboring points that form a 3-D surface patch are used to generate a search space, an isodepth map, to which the surface patch belongs. The 3-D patch can be modeled by a slice of pixel intensities and depth values carved out of the isodepth map. We show that this slice is constrained by the texture of the patch as viewed in multiple images, a property we call the Epitexture constraint. Depth values carved out by the Epitexture constraint are coherent with depth samples of the surface as well. Depth samples of the surface can be either in the form of dense range data or derived from a set of images using stereo or structure-from-motion algorithms. The Epitexture constraint can be combined with texture and structure properties of natural surfaces. These properties regularize the slice by imposing a priori constraints. Regularization is essential if high-quality 3-D models are to be computed from 2-D images. We show modeling results of several natural scenes. Results obtained for a variety of scenes, ranging from building facades to highly cluttered natural environments, demonstrate that the use of image and depth coherence yields high fidelity models.
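As a reminder of what the epipolar constraint states, the sketch below uses the standard two-view relation x'ᵀ F x = 0, where F is the fundamental matrix and x, x' are corresponding points in homogeneous pixel coordinates. It is not taken from the thesis: the function names are hypothetical, and the example F is the textbook fundamental matrix of an ideal rectified stereo pair, chosen only because its epipolar lines are easy to check by hand.

```python
def epipolar_line(F, point):
    """Return line coefficients (a, b, c) of l' = F x in the second image.

    F:     3x3 fundamental matrix as nested lists.
    point: (u, v) pixel coordinates in the first image.
    """
    u, v = point
    x = (u, v, 1)  # homogeneous coordinates
    return tuple(sum(F[r][c] * x[c] for c in range(3)) for r in range(3))


def epipolar_residual(F, point_a, point_b):
    """Evaluate x'^T F x; zero means point_b lies on point_a's epipolar line."""
    a, b, c = epipolar_line(F, point_a)
    return a * point_b[0] + b * point_b[1] + c


# Assumed example: ideal rectified stereo pair (pure horizontal translation),
# for which every epipolar line is the image row with the same v coordinate.
F_rectified = [
    [0, 0, 0],
    [0, 0, -1],
    [0, 1, 0],
]

line = epipolar_line(F_rectified, (10, 20))
on_line = epipolar_residual(F_rectified, (10, 20), (35, 20))
off_line = epipolar_residual(F_rectified, (10, 20), (35, 22))
```

For the rectified case the line coefficients reduce to -y' + v = 0, so a candidate match on the same row satisfies the constraint and one on a different row does not. Sampling pixel intensities along such lines is the kind of search the paragraph above describes in general terms.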
BibTeX
@phdthesis{Mishra-2005-9174,
author = {Pragyana Mishra},
title = {Image and Depth Coherent Surface Description},
year = {2005},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-05-15},
}