Multi-node Training for StyleGAN2

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

StyleGAN2 is a TensorFlow-based Generative Adversarial Network (GAN) framework that represents the state of the art in generative image modelling. The current release of StyleGAN2 implements multi-GPU training via TensorFlow’s device contexts, which limits data parallelism to a single node. In this work, a data-parallel multi-node training capability is implemented in StyleGAN2 via Horovod, which enables harnessing the compute capability of larger cluster architectures. We demonstrate that the new Horovod-based communication outperforms the previous device-context approach on a single node. Furthermore, we demonstrate that multi-node training does not compromise the accuracy of StyleGAN2 for a constant effective batch size. Finally, we report strong and weak scaling of the new implementation up to 64 NVIDIA Tesla A100 GPUs distributed across eight NVIDIA DGX A100 nodes, demonstrating the utility of the approach at scale.
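As a reading aid for the terms "constant effective batch size" and "strong and weak scaling", the following is a minimal Python sketch of the usual textbook definitions. It is not taken from the paper: the function names, the batch size of 64, and the GPU counts in the loop are illustrative assumptions, and the paper may measure scaling differently.

```python
# Illustrative sketch only; names and numbers are assumptions, not the paper's code.

def per_worker_batch(effective_batch: int, num_workers: int) -> int:
    """Per-GPU minibatch that keeps the effective (global) batch size constant."""
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    if effective_batch % num_workers != 0:
        raise ValueError("effective batch must divide evenly across workers")
    return effective_batch // num_workers


def strong_scaling_efficiency(t_one: float, t_n: float, n: int) -> float:
    """Fixed total problem size: ideal time on n workers is t_one / n."""
    return t_one / (n * t_n)


def weak_scaling_efficiency(t_one: float, t_n: float) -> float:
    """Fixed work per worker: ideal time on n workers equals t_one."""
    return t_one / t_n


if __name__ == "__main__":
    # Hypothetical effective batch of 64 images split across data-parallel workers.
    for gpus in (8, 16, 32, 64):
        print(gpus, "GPUs ->", per_worker_batch(64, gpus), "images per GPU")
```

Under these definitions, an efficiency of 1.0 corresponds to ideal scaling, and values below 1.0 indicate communication or synchronisation overhead.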

Original language: English
Title of host publication: Pattern Recognition. ICPR International Workshops and Challenges, 2021, Proceedings
Editors: Alberto Del Bimbo, Rita Cucchiara, Stan Sclaroff, Giovanni Maria Farinella, Tao Mei, Marco Bertini, Hugo Jair Escalante, Roberto Vezzani
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 677-684
Number of pages: 8
ISBN (Print): 9783030687625
Publication status: Published - 2021
MoE publication type: A4 Article in a conference publication
Event: International Conference on Pattern Recognition - Milan, Italy
Duration: 10 Jan 2021 → 11 Jan 2021
Conference number: 25

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer
Volume: 12661 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Pattern Recognition
Abbreviated title: ICPR
Country: Italy
City: Milan
Period: 10/01/2021 → 11/01/2021

Keywords

  • GAN
  • GPU
  • Massively parallel architectures
  • Multi-node training
  • StyleGAN2
