Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks

Samuli Laine, Tero Karras, Timo Aila, Antti Herva, Shunsuke Saito, Ronald Yu, Hao Li, Jaakko Lehtinen

Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceedings; Scientific; peer-review

68 Citations (Scopus)

Abstract

We present a real-time deep learning framework for video-based facial performance capture: the dense 3D tracking of an actor's face given a monocular video. Our pipeline begins with accurately capturing a subject using a high-end production facial capture pipeline based on multi-view stereo tracking and artist-enhanced animations. With 5 to 10 minutes of captured footage, we train a convolutional neural network to produce high-quality output, including self-occluded regions, from a monocular video sequence of that subject. Since this 3D facial performance capture is fully automated, our system can drastically reduce the amount of labor involved in the development of modern narrative-driven video games or films involving realistic digital doubles of actors and potentially hours of animated dialogue per character. We compare our results with several state-of-the-art monocular real-time facial capture techniques and demonstrate compelling animation inference in challenging areas such as eyes and lips.
Original language: English
Title of host publication: SCA '17 Proceedings of the ACM SIGGRAPH / Eurographics Symposium on Computer Animation
Publisher: ACM
Number of pages: 10
ISBN (Electronic): 978-1-4503-5091-4
Publication status: Published - Jul 2017
MoE publication type: A4 Conference publication
Event: ACM SIGGRAPH / Eurographics Symposium on Computer Animation - University of California, Los Angeles, Los Angeles, United States
Duration: 28 Jul 2017 - 30 Jul 2017

Conference

Conference: ACM SIGGRAPH / Eurographics Symposium on Computer Animation
Abbreviated title: SCA
Country/Territory: United States
City: Los Angeles
Period: 28/07/2017 - 30/07/2017
