Describing UI Screenshots in Natural Language

Luis A. Leiva, Asutosh Hota, Antti Oulasvirta

Research output: Journal article › Article › Scientific › Peer-reviewed

2 Citations (Scopus)
82 Downloads (Pure)

Abstract

Being able to describe any user interface (UI) screenshot in natural language can promote understanding of the main purpose of the UI, yet currently it cannot be accomplished with state-of-the-art captioning systems. We introduce XUI, a novel method inspired by the global precedence effect to create informative descriptions of UIs, starting with an overview and then providing fine-grained descriptions about the most salient elements. XUI builds upon computational models for topic classification, visual saliency prediction, and natural language generation (NLG). XUI provides descriptions with up to three different granularity levels that, together, describe what is in the interface and what the user can do with it. We found that XUI descriptions are highly readable, are perceived to accurately describe the UI, and score similarly to human-generated UI descriptions. XUI is available as open-source software.

Original language: English
Article number: 19
Number of pages: 28
Journal: ACM Transactions on Intelligent Systems and Technology
Volume: 14
Issue number: 1
Status: Published - 9 Nov 2022
Ministry of Education and Culture (OKM) publication type: A1 Original article in a scientific journal
