A study of keystroke data in two contexts: written language and programming language influence predictability of learning outcomes

John Edwards, Juho Leinonen, Arto Hellas

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

18 Citations (Scopus)
143 Downloads (Pure)

Abstract

We study programming process data from two introductory programming courses. Between the course contexts, the programming languages differ, the teaching approaches differ, and the spoken languages differ. In both courses, students' keystroke data, consisting of timestamps and the pressed keys, are recorded as students work on programming assignments. We study how the keystroke data differ between the contexts, and whether research on predicting course outcomes using keystroke latencies generalizes to other contexts. Our results show that there are differences between the contexts in terms of frequently used keys, which can be partially explained by the differences between the spoken languages and the programming languages. Further, our results suggest that programming process data that can be collected non-intrusively in situ can be used for predicting course outcomes in multiple contexts. The predictive power, however, varies between contexts, possibly because the frequently used keys differ between programming languages and spoken languages. Thus, context-specific fine-tuning of predictive models may be needed.

Original language: English
Title of host publication: SIGCSE 2020 - Proceedings of the 51st ACM Technical Symposium on Computer Science Education
Publisher: ACM
Pages: 413-419
Number of pages: 7
ISBN (Electronic): 9781450367936
Publication status: Published - 26 Feb 2020
MoE publication type: A4 Conference publication
Event: ACM Technical Symposium on Computer Science Education - Portland, United States
Duration: 11 Mar 2020 to 14 Mar 2020
Conference number: 51

Publication series

Name: Annual Conference on Innovation and Technology in Computer Science Education
ISSN (Print): 1942-647X

Conference

Conference: ACM Technical Symposium on Computer Science Education
Abbreviated title: SIGCSE
Country/Territory: United States
City: Portland
Period: 11/03/2020 to 14/03/2020

Keywords

  • Digraphs
  • Educational data mining
  • Keystroke analysis
  • Keystroke data
  • Predicting performance
  • Programming process data
