Generating Long Videos of Dynamic Scenes

Tim Brooks, Janne Hellsten, Miika Aittala, Ting-Chun Wang, Timo Aila, Jaakko Lehtinen, Ming-Yu Liu, Alexei Efros, Tero Karras

Research output: Chapter in Book/Report/Conference proceeding, Conference article in proceedings, Scientific, peer-review

Abstract

We present a video generation model that accurately reproduces object motion, changes in camera viewpoint, and new content that arises over time. Existing video generation methods often fail to produce new content as a function of time while maintaining consistencies expected in real environments, such as plausible dynamics and object persistence. A common failure case is for content to never change due to over-reliance on inductive bias to provide temporal consistency, such as a single latent code that dictates content for the entire video. On the other extreme, without long-term consistency, generated videos may morph unrealistically between different scenes. To address these limitations, we prioritize the time axis by redesigning the temporal latent representation and learning long-term consistency from data by training on longer videos. We leverage a two-phase training strategy, where we separately train using longer videos at a low resolution and shorter videos at a high resolution. To evaluate the capabilities of our model, we introduce two new benchmark datasets with explicit focus on long-term temporal dynamics.
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Publisher: Morgan Kaufmann Publishers
Number of pages: 13
ISBN (Print): 978-1-7138-7108-8
Publication status: Published - 2022
MoE publication type: A4 Conference publication
Event: Conference on Neural Information Processing Systems - New Orleans, United States
Duration: 28 Nov 2022 - 9 Dec 2022
Conference number: 36
https://nips.cc/

Publication series

Name: Advances in Neural Information Processing Systems
Publisher: Morgan Kaufmann Publishers
Volume: 35
ISSN (Print): 1049-5258

Conference

Conference: Conference on Neural Information Processing Systems
Abbreviated title: NeurIPS
Country/Territory: United States
City: New Orleans
Period: 28/11/2022 - 09/12/2022
Internet address
