Computer graphics artists who produce computer-animated movies and games spend much of their time creating subtle movements such as facial expressions, body gestures and the draping of clothes. A new way of modeling these dynamic objects, developed by researchers at Carnegie Mellon University; Disney Research, Pittsburgh; and the LUMS School of Science and Engineering in Pakistan, could greatly simplify this editing process.
Graphics software usually represents dynamic objects, such as an expressive face, as a sequence of shapes, with each shape composed of a set of points in space. Another way to model an expressive face is to chart each point on the face as it shifts location over time. Each method has its advantages, but the sheer number of possible variations is tremendous, which results in models that are large and difficult to manage.
The Pittsburgh researchers, however, found that they could create a model that simultaneously takes into account both space and time — a bilinear spatiotemporal basis model. Though this approach might sound more complex, the researchers found the contrary. The method enabled them to create a much more compact, powerful and easy-to-manage model. For example, they showed that they could reproduce a dynamic sequence, with millimeter precision, after discarding 99 percent of the original data points.
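As a rough illustration of how a model that is linear in both space and time can be so compact, the sketch below approximates a sequence matrix `S` (one row per frame, one column per point coordinate) as `mean_shape + Theta @ C @ B.T`. It is not the researchers' implementation. The choice of a cosine (DCT) basis for time and a PCA basis for shape, the function names `dct_basis` and `fit_bilinear`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def dct_basis(num_frames, num_terms):
    """Orthonormal DCT-II basis: each column is a smooth curve over time."""
    n = np.arange(num_frames)[:, None]
    k = np.arange(num_terms)[None, :]
    basis = np.cos(np.pi * (2 * n + 1) * k / (2 * num_frames))
    basis *= np.sqrt(2.0 / num_frames)
    basis[:, 0] /= np.sqrt(2.0)  # the constant column needs a different scale
    return basis


def fit_bilinear(S, num_time_terms, num_shape_terms):
    """Fit S (frames x coordinates) as mean_shape + Theta @ C @ B.T."""
    mean_shape = S.mean(axis=0, keepdims=True)
    centered = S - mean_shape
    # Shape basis (assumed here to be PCA): top right singular vectors.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    B = vt[:num_shape_terms].T
    # Time basis (assumed here to be DCT): fixed, not learned from data.
    Theta = dct_basis(S.shape[0], num_time_terms)
    # Both bases have orthonormal columns, so the least-squares
    # coefficients are a plain projection.
    C = Theta.T @ centered @ B
    return mean_shape, Theta, C, B


# Synthetic toy sequence: 50 points in 3-D moving smoothly over 200 frames.
rng = np.random.default_rng(0)
num_frames, num_points = 200, 50
time_signals = dct_basis(num_frames, 8) @ rng.normal(size=(8, 3))
shape_modes = rng.normal(size=(3, 3 * num_points))
rest_pose = rng.normal(size=(1, 3 * num_points))
S = rest_pose + time_signals @ shape_modes

mean_shape, Theta, C, B = fit_bilinear(S, num_time_terms=8, num_shape_terms=3)
reconstruction = mean_shape + Theta @ C @ B.T
max_error = np.abs(reconstruction - S).max()

# S holds 30000 numbers; C holds only 24 (the bases must be stored too).
print(S.size, C.size, max_error)
```

The toy sequence is built to be smooth in time and low-rank in shape, so the small coefficient matrix `C` reproduces it almost exactly. Real captured motion would only be approximated, with accuracy depending on how many basis terms are kept.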
Their findings will be presented Aug. 6 at SIGGRAPH 2012, the International Conference on Computer Graphics and Interactive Techniques, at the Los Angeles Convention Center.
Yaser Sheikh, assistant research professor in Carnegie Mellon’s Robotics Institute, explained that the natural constraints on spatial movements, such as the characteristic ways that the face changes shape as someone is talking or expressing an emotion, combine with the natural constraints on how much movement can occur over a given stretch of time. This enables the models to be very compact and efficient.
“Simply put, this lets us do things more sensibly with less work,” Sheikh said.
Spatiotemporal data is inherent not only in computer simulations and animations, but also in object and camera tracking, so building more efficient models can have a number of practical implications. In motion editing, for instance, the models created with the bilinear spatiotemporal representation make it easy to change one point in space and time, such as bringing the head of a soccer player forward to make contact with a ball, while keeping it consistent with other points in the model, said Tomas Simon, a Robotics Institute Ph.D. student and a Disney Research intern.
Likewise, action sequences based on motion capture data often require tedious post-processing to fix missing markers, incorrectly labeled markers and other glitches. A sequence that would take a computer graphics artist two or three hours to process using conventional models could be completed in just a few minutes using the new models, with similar quality, Simon said.
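One plausible way a compact space-time model could help with missing markers is to fit its few coefficients to the entries that were observed and then read the missing entries off the fitted model. The sketch below shows that idea on synthetic data. It is an illustrative assumption rather than the team's published procedure: the helper `fill_gaps`, the DCT time basis, the random orthonormal shape basis standing in for one learned from clean data, and the mean-centered toy sequence are all hypothetical.

```python
import numpy as np


def dct_basis(num_frames, num_terms):
    """Orthonormal DCT-II basis: each column is a smooth curve over time."""
    n = np.arange(num_frames)[:, None]
    k = np.arange(num_terms)[None, :]
    basis = np.cos(np.pi * (2 * n + 1) * k / (2 * num_frames))
    basis *= np.sqrt(2.0 / num_frames)
    basis[:, 0] /= np.sqrt(2.0)
    return basis


def fill_gaps(S_observed, observed_mask, Theta, B):
    """Fit Theta @ C @ B.T to the observed entries and fill in the rest."""
    # With row-major flattening, (Theta @ C @ B.T).ravel() equals
    # np.kron(Theta, B) @ C.ravel(), so C solves a linear least-squares
    # problem restricted to the rows that were actually observed.
    A = np.kron(Theta, B)
    seen = observed_mask.ravel()
    coeffs, *_ = np.linalg.lstsq(A[seen], S_observed.ravel()[seen], rcond=None)
    C = coeffs.reshape(Theta.shape[1], B.shape[1])
    estimate = Theta @ C @ B.T
    # Keep the measured values; use the model only where data is missing.
    return np.where(observed_mask, S_observed, estimate)


# Synthetic, mean-centered toy sequence: 10 markers in 3-D over 60 frames.
rng = np.random.default_rng(1)
num_frames, num_coords = 60, 30
Theta = dct_basis(num_frames, 5)
B, _ = np.linalg.qr(rng.normal(size=(num_coords, 3)))  # stand-in shape basis
S_true = Theta @ rng.normal(size=(5, 3)) @ B.T

# Drop roughly 20 percent of the entries to mimic lost markers.
observed_mask = rng.random(S_true.shape) > 0.2
S_observed = np.where(observed_mask, S_true, np.nan)

S_filled = fill_gaps(S_observed, observed_mask, Theta, B)
gap_error = np.abs(S_filled - S_true).max()
print(int((~observed_mask).sum()), gap_error)
```

Because the toy sequence lies exactly in the span of the two bases, the gaps are recovered almost perfectly here. With real capture data the fill would be an approximation, and mislabeled markers would need separate handling.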
Iain Matthews, senior research scientist at Disney Research, Pittsburgh, said the bilinear spatiotemporal basis models are possible, in part, because today’s computers have enough memory to process data sets that can include millions of variables. “The ability to interact with large dynamic sequences in data-consistent ways and in real time has lots of interesting applications,” he added.
In addition to Sheikh, Simon, and Matthews, the research team included Ijaz Akhter, a LUMS School graduate student and Disney Research intern, and Sohaib Khan, head of the LUMS School computer science department.