MonoClothCap: Towards Temporally Coherent Clothing Capture from Monocular RGB Video

Donglai Xiang, Fabian Prada, Chenglei Wu, and Jessica Hodgins
Conference Paper, Proceedings of the International Conference on 3D Vision (3DV '20), pp. 322–332, November 2020

Abstract

We present a method to capture temporally coherent dynamic clothing deformation from a monocular RGB video input. In contrast to the existing literature, our method does not require a pre-scanned personalized mesh template, and thus can be applied to in-the-wild videos. To constrain the output to a valid deformation space, we build statistical deformation models for three types of clothing: T-shirt, short pants, and long pants. A differentiable renderer is utilized to align our captured shapes to the input frames by minimizing the difference in both silhouette and texture. We develop a UV texture growing method that sequentially expands the visible texture region of the clothing in order to minimize drift in deformation tracking. We also extract fine-grained wrinkle detail from the input videos by fitting the clothed surface to normal maps estimated by a convolutional neural network. Our method produces temporally coherent reconstructions of body and clothing from monocular video. We demonstrate successful clothing capture results from a variety of challenging videos. Extensive quantitative experiments demonstrate the effectiveness of our method on metrics including body pose error and surface reconstruction error of the clothing.
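The abstract does not spell out the alignment objective, so the sketch below is only a rough illustration of the idea of fitting a differentiably rendered clothed surface to an input frame by combining a silhouette term with a texture (photometric) term. The function name `alignment_loss`, the weights `w_sil` and `w_tex`, and the visibility masking are assumptions for illustration, not the paper's actual formulation; the tensors stand in for a differentiable renderer's output and a video frame.

```python
import torch

def alignment_loss(rendered_rgb, rendered_mask, target_rgb, target_mask,
                   w_sil=1.0, w_tex=1.0):
    """Illustrative combined silhouette + texture loss (hypothetical form).

    rendered_rgb / target_rgb: (3, H, W) float tensors in [0, 1]
    rendered_mask / target_mask: (H, W) float tensors, soft silhouettes in [0, 1]
    """
    # Silhouette term: squared difference between soft silhouettes.
    sil_loss = ((rendered_mask - target_mask) ** 2).mean()

    # Texture term: L1 photometric error, restricted to pixels where both
    # the rendered and observed silhouettes indicate visible clothing.
    overlap = (rendered_mask * target_mask).unsqueeze(0)  # (1, H, W)
    tex_loss = (overlap * (rendered_rgb - target_rgb).abs()).sum() \
        / (overlap.sum() * 3 + 1e-8)

    return w_sil * sil_loss + w_tex * tex_loss

# Toy usage: random tensors stand in for renderer output and an input frame.
if __name__ == "__main__":
    H, W = 256, 256
    rendered_rgb = torch.rand(3, H, W, requires_grad=True)
    rendered_mask = torch.rand(H, W, requires_grad=True)
    target_rgb = torch.rand(3, H, W)
    target_mask = (torch.rand(H, W) > 0.5).float()
    loss = alignment_loss(rendered_rgb, rendered_mask, target_rgb, target_mask)
    loss.backward()  # in practice, gradients flow back to deformation parameters
    print(float(loss))
```

In the actual pipeline such gradients would propagate through the differentiable renderer to the parameters of the statistical clothing deformation models, rather than to raw image tensors as in this toy example.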

BibTeX

@conference{Xiang-2020-126801,
author = {Donglai Xiang and Fabian Prada and Chenglei Wu and Jessica Hodgins},
title = {MonoClothCap: Towards Temporally Coherent Clothing Capture from Monocular RGB Video},
booktitle = {Proceedings of International Conference on 3D Vision (3DV '20)},
year = {2020},
month = {November},
pages = {322--332},
}