Methodological Considerations for Predicting At-risk Students

Charles Koutcheme*, Sami Sarsa, Arto Hellas, Lassi Haaranen, Juho Leinonen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

8 Citations (Scopus)
79 Downloads (Pure)


Educational researchers have long sought to increase student retention. One stream of this research seeks to automatically identify students who are at risk of dropping out. Studies tend to agree that earlier identification of at-risk students is better, as it leaves more room for targeted interventions. We examine the interplay between the data and the predictive power of machine learning models used to identify at-risk students. We critically examine the often-used approach in which data collected from weeks 1, 2, ..., n is used to predict whether a student becomes inactive in a subsequent week w, w ≥ n + 1, and point out issues with this approach that may inflate models’ predictive power. Specifically, our empirical analysis highlights that including students who became inactive in week n or before, where n > 1, in the data used to identify students who are inactive in the following weeks is a significant cause of bias. Including students who dropped out during the first week makes the problem significantly easier, since they have no data in the subsequent weeks. Based on our results, we recommend including only students who are active until week n when building and evaluating models for predicting dropouts in subsequent weeks, and evaluating and reporting the particularities of the respective course contexts.
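The recommended filtering step can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `build_dataset`, the toy activity counts, and the two working definitions are all assumptions made for illustration: a student counts as "active until week n" if they have any recorded activity in week n, and as a "dropout" if they have no activity in any week after n.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, List[int], int]  # (student id, features from weeks 1..n, dropout label)


def build_dataset(activity: Dict[str, List[int]], n: int, only_active: bool = True) -> List[Row]:
    """Build (student, features, label) rows from weekly activity counts.

    Assumption: a student is "active until week n" if week n has activity.
    Assumption: label 1 means no activity in any week after week n.
    """
    rows: List[Row] = []
    for student, weeks in sorted(activity.items()):
        features = weeks[:n]  # data from weeks 1..n only
        if only_active and features[-1] == 0:
            # Already inactive in week n: keeping this student would make
            # the prediction task artificially easy.
            continue
        label = int(sum(weeks[n:]) == 0)
        rows.append((student, features, label))
    return rows


# Hypothetical weekly submission counts for a six-week course.
activity = {
    "s1": [5, 4, 6, 3, 2, 4],  # active throughout
    "s2": [3, 0, 0, 0, 0, 0],  # inactive after week 1
    "s3": [4, 2, 0, 0, 0, 0],  # inactive after week 2
    "s4": [6, 5, 3, 0, 0, 0],  # inactive after week 3
    "s5": [2, 3, 4, 1, 0, 2],  # skips week 5 but returns
}

naive = build_dataset(activity, n=3, only_active=False)
filtered = build_dataset(activity, n=3, only_active=True)
```

In this toy data the unfiltered set contains two students (`s2`, `s3`) whose week-3 features are already zero, so their labels are trivially predictable. The filtered set keeps only students for whom the prediction is a genuine forecast.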
Original language: English
Title of host publication: ACE '22: Australasian Computing Education Conference
Number of pages: 9
ISBN (Electronic): 9781450396431
Publication status: Published - 14 Feb 2022
MoE publication type: A4 Conference publication
Event: Australasian Computing Education Conference - Virtual, Online, Australia
Duration: 14 Feb 2022 to 18 Feb 2022
Conference number: 24


Conference: Australasian Computing Education Conference
Abbreviated title: ACE
City: Virtual, Online