Lexical and function words or language and text type? Abbreviation consistency in an aligned corpus of Latin and Middle English plague tracts

Alpo Honkapohja*, Jukka Suomela

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-reviewed

1 Citation (Scopus)
69 Downloads (Pure)

Abstract

This study examines the consistency of medieval abbreviation practices in a parallel corpus consisting of Latin and Middle English copies of a plague treatise attributed to John of Burgundy (JB). Focusing on different versions of the treatise enables us to maximize textual and lexical overlap, comparing differences caused by text type, word type, and language. We examine how the following variables affect the consistency of abbreviating across manuscript witnesses: (A) language: Latin versus English; (B) text type: recipes versus running text; (C) word type: lexical versus function words; and (D) the number of characters in a word. Variables A-D are compared using a parallel corpus of automatically aligned, richly tagged TEI P5 XML transcriptions of six manuscript witnesses to the JB treatise. The alignment process is based on computer-human collaboration and a custom-built alignment tool, which uses sections tagged in the TEI XML file and word division. The results reveal that abbreviation was overwhelmingly more consistent in Latin than in Middle English and somewhat more consistent in recipes. High token counts of frequent lexical items had a major effect on the results. Word length worked better than division into lexical and function words.

Original language: English
Pages: 765–787
Journal: Digital Scholarship in the Humanities
Volume: 37
Issue number: 3
Status: Published - 2022
Ministry of Education and Culture (OKM) publication type: A1 Original article in a scientific journal
