Real-time 3D neural facial animation from binocular video
Abstract
We present a method for performing real-time facial animation of a 3D avatar from binocular video. Existing facial animation methods fail to automatically capture precise and subtle facial motions for driving a photo-realistic 3D avatar "in-the-wild" (e.g., under variable illumination and camera noise). The novelty of our approach lies in a lightweight process for specializing a personalized face model to new environments, enabling extremely accurate real-time face tracking anywhere. Our method uses a pre-trained high-fidelity personalized model of the face, which we complement with a novel illumination model to account for variations due to lighting and other factors often encountered in-the-wild (e.g., facial hair growth, makeup, skin blemishes). Our approach comprises two steps. First, we solve for our illumination model's parameters by applying analysis-by-synthesis on a short video recording. Using the pairs of model parameters (rigid, non-rigid) and the original images, we learn a regression function for real-time inference from the image space to the 3D shape and texture of the avatar. Second, given a new video, we fine-tune the real-time regression model with a few-shot learning strategy to adapt it to the new environment. We demonstrate our system's ability to precisely capture subtle facial motions in unconstrained scenarios, in comparison to competing methods, on a diverse collection of identities, expressions, and real-world environments.
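To make the two-step adaptation pipeline concrete, the sketch below mocks it up in PyTorch. It is illustrative only: the names (GainBiasIllumination, render_fn, the linear regressor) are toy stand-ins we assume for exposition, not the authors' architecture or code. It shows the structure the abstract describes, namely analysis-by-synthesis over illumination parameters with the avatar held fixed, followed by few-shot fine-tuning of the image-to-parameters regressor.

import torch
import torch.nn as nn

class GainBiasIllumination(nn.Module):
    """Toy illumination model: per-channel gain and bias applied to the
    avatar's render (a stand-in for the paper's learned illumination model)."""
    def __init__(self):
        super().__init__()
        self.gain = nn.Parameter(torch.ones(3, 1, 1))
        self.bias = nn.Parameter(torch.zeros(3, 1, 1))

    def forward(self, rendered):
        return rendered * self.gain + self.bias

def fit_illumination(render_fn, frames, params, steps=200, lr=1e-2):
    """Step 1 (analysis-by-synthesis): keep the pre-trained avatar fixed and
    optimize illumination parameters so renders match the short recording."""
    illum = GainBiasIllumination()
    opt = torch.optim.Adam(illum.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = 0.0
        for img, p in zip(frames, params):
            # Photometric reconstruction loss between relit render and frame.
            loss = loss + (illum(render_fn(p)) - img).abs().mean()
        loss.backward()
        opt.step()
    return illum

def finetune_regressor(regressor, frames, params, steps=50, lr=1e-4):
    """Step 2 (few-shot adaptation): fine-tune the real-time regressor on a
    handful of (image, model-parameter) pairs from the new environment."""
    opt = torch.optim.Adam(regressor.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = sum(((regressor(f) - p) ** 2).mean() for f, p in zip(frames, params))
        loss.backward()
        opt.step()
    return regressor

if __name__ == "__main__":
    # Dummy data standing in for stereo crops and face-model parameter codes.
    frames = [torch.rand(3, 64, 64) for _ in range(4)]
    params = [torch.rand(16) for _ in range(4)]
    render_fn = lambda p: torch.rand(3, 64, 64)  # stand-in for the avatar renderer
    illum = fit_illumination(render_fn, frames, params)
    regressor = nn.Sequential(nn.Flatten(start_dim=0), nn.Linear(3 * 64 * 64, 16))
    regressor = finetune_regressor(regressor, frames, params)

In the actual system the renderer is the personalized neural avatar and the regressor maps binocular images to rigid and non-rigid model parameters; the sketch only conveys the optimize-then-fine-tune structure.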
BibTeX
@article{Cao-2021-129823,
author = {Chen Cao and Vasu Agrawal and Fernando De la Torre and Lele Chen and Jason M. Saragih and Tomas Simon and Yaser Sheikh},
title = {Real-time 3D neural facial animation from binocular video},
journal = {ACM Transactions on Graphics (TOG)},
year = {2021},
month = {August},
volume = {40},
number = {4},
}