Abstract
Creative sketching or doodling is an expressive activity, where imaginative and previously unseen depictions of everyday visual objects are drawn. Creative sketch image generation is a challenging vision problem, where the task is to generate diverse, yet realistic creative sketches possessing the unseen composition of the visual-world objects. Here, we propose a novel coarse-to-fine two-stage framework, DoodleFormer, that decomposes the creative sketch generation problem into the creation of coarse sketch composition followed by the incorporation of fine-details in the sketch. We introduce graph-aware transformer encoders that effectively capture global dynamic as well as local static structural relations among different body parts. To ensure diversity of the generated creative sketches, we introduce a probabilistic coarse sketch decoder that explicitly models the variations of each sketch body part to be drawn. Experiments are performed on two creative sketch datasets: Creative Birds and Creative Creatures. Our qualitative, quantitative and human-based evaluations show that DoodleFormer outperforms the state-of-the-art on both datasets, yielding realistic and diverse creative sketches. On Creative Creatures, DoodleFormer achieves an absolute gain of 25 in Frèchet inception distance (FID) over state-of-the-art. We also demonstrate the effectiveness of DoodleFormer for related applications of text to creative sketch generation, sketch completion and house layout generation. Code is available at: https://github.com/ankanbhunia/doodleformer.
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2022 - 17th European Conference, Proceedings |
Editors | Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner |
Publisher | SPRINGER |
Pages | 338-355 |
Number of pages | 18 |
ISBN (Print) | 978-3-031-19789-5 |
DOIs | |
Publication status | Published - 2022 |
MoE publication type | A4 Article in a conference publication |
Event | European Conference on Computer Vision - Tel Aviv, Israel Duration: 23 Oct 2022 → 27 Oct 2022 Conference number: 17 https://eccv2022.ecva.net |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Publisher | Springer |
Volume | 13677 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Conference on Computer Vision |
---|---|
Abbreviated title | ECCV |
Country/Territory | Israel |
City | Tel Aviv |
Period | 23/10/2022 → 27/10/2022 |
Internet address |