Lexical and function words or language and text type? Abbreviation consistency in an aligned corpus of Latin and Middle English plague tracts

Alpo Honkapohja*, Jukka Suomela

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

This study examines the consistency of medieval abbreviation practices in a parallel corpus consisting of Latin and Middle English copies of a plague treatise attributed to John of Burgundy (JB). Focusing on different versions of the same treatise enables us to maximize textual and lexical overlap while comparing differences caused by text type, word type, and language. We examine how the following variables affect the consistency of abbreviation across manuscript witnesses: (A) language: Latin versus English; (B) text type: recipes versus running text; (C) word type: lexical versus function words; and (D) word length: the number of characters in a word. Variables A-D are compared using a parallel corpus of automatically aligned, rich TEI P5 XML-tagged transcriptions of six manuscript witnesses to the JB treatise. The alignment process is based on computer-human collaboration and a custom-built alignment tool, which uses sections tagged in the TEI XML file and word division. The results reveal that abbreviation was overwhelmingly more consistent in Latin than in Middle English and somewhat more consistent in recipes than in running text. High token counts of frequent lexical items had a major effect on the results. Word length worked better as a variable than the division into lexical and function words.

Original language: English
Pages (from-to): 765–787
Journal: Digital Scholarship in the Humanities
Volume: 37
Issue number: 3
Publication status: Published - 2022
MoE publication type: A1 Journal article-refereed

Keywords

  • 15TH-CENTURY
