Realistic text replacement with non-uniform style conditioning

Arseny Nerinovsky*, Igor Buzhinsky, Andrey Filchenkov

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review


In this work, we study the possibility of realistic text replacement, whose goal is to replace text present in an image with user-supplied text. The replacement should be performed so that the resulting image cannot be distinguished from the original one. We achieve this goal by developing a novel non-uniform style conditioning layer and applying it to an encoder-decoder ResNet-based architecture. The resulting model is a single-stage model with no post-processing. We train the model with a combination of adversarial, style, content and L1 losses. Qualitative and quantitative evaluations show that the model achieves realistic text replacement and outperforms existing approaches in multilingual and challenging scenarios. Quantitative evaluation is performed with direct metrics, such as SSIM and PSNR, and with proxy metrics based on the performance of a text recognition model. The proposed model has several potential applications in artificial reality.
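As an illustration of one of the direct metrics named in the abstract, the following is a minimal sketch of PSNR between an original and an edited image. It is not the authors' evaluation code: the function name `psnr`, the nested-list grayscale image representation, and the default peak value of 255 are assumptions made for this example only.

```python
import math


def psnr(original, edited, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two grayscale images.

    Images are given as nested lists of pixel intensities with identical
    shapes. This representation is an assumption made for illustration.
    """
    if len(original) != len(edited) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(original, edited)
    ):
        raise ValueError("images must have the same shape")

    # Accumulate the squared pixel-wise error over the whole image.
    squared_error = 0.0
    count = 0
    for row_a, row_b in zip(original, edited):
        for a, b in zip(row_a, row_b):
            squared_error += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the edited image is closer to the reference. PSNR only measures pixel-wise error, which is presumably why the abstract pairs it with SSIM and with recognition-based proxy metrics.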

Original language: English
Journal: IEEE Access
Publication status: E-pub ahead of print - Apr 2021
MoE publication type: A1 Journal article-refereed


  • GAN
  • Style conditioning
  • Text replacement
