Event Classification with Imbalanced and Missing Data for an Air-Handling Unit

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review



Reliable fault prediction for air handling units (AHUs) is key to correcting errors and eliminating non-optimal operation. Machine learning classification methods combined with data sampling are widely used to forecast such events, which by their nature occur only rarely in equipment. The model proposed in this paper draws on seven years of data from an air handling unit, including temperature, humidity, CO2 content, and fan speed. The paper contributes to the field of imbalanced classification by proposing a novel data undersampling algorithm that improves classification results in the presence of imbalanced and missing data. It also compares several oversampling methods, undersampling methods, probability calibration, and machine learning methods. Finally, the paper reports on the proposed final model (the proposed undersampling Algorithm 1, Tomek Links, and Logistic Regression), which forecasts relatively rare imperfect heat recovery events in an air handling unit. The precision of the final model was 0.93 on unseen data, a reasonable result given that the class imbalance coincides with missing data sequences.
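The abstract names Tomek Links as one step of the final model and reports precision as the evaluation metric. The following is a minimal sketch of those two ideas only, in plain Python. It does not reproduce the paper's Algorithm 1, which is not described in this abstract. The function names (`tomek_links`, `drop_majority_in_links`, `precision`) and the toy data are hypothetical and chosen purely for illustration.

```python
from math import dist


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links.

    Two samples form a Tomek link when they are each other's nearest
    neighbour and carry different class labels.
    """
    n = len(X)
    # Nearest neighbour of every sample, excluding the sample itself.
    nearest = [
        min((k for k in range(n) if k != i), key=lambda k: dist(X[i], X[k]))
        for i in range(n)
    ]
    return [
        (i, j)
        for i, j in enumerate(nearest)
        if i < j and nearest[j] == i and y[i] != y[j]
    ]


def drop_majority_in_links(X, y, majority_label):
    """Undersample by removing the majority-class member of each Tomek link."""
    drop = set()
    for i, j in tomek_links(X, y):
        for idx in (i, j):
            if y[idx] == majority_label:
                drop.add(idx)
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def precision(y_true, y_pred, positive=1):
    """Precision = true positives / all predicted positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical toy data: label 0 = normal operation (majority),
# label 1 = rare event (minority). Not taken from the paper.
X = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (1.0, 1.0), (1.05, 1.0), (3.0, 3.0)]
y = [0, 0, 0, 0, 1, 1]

X_clean, y_clean = drop_majority_in_links(X, y, majority_label=0)
```

In this toy set, the majority sample at `(1.0, 1.0)` and the minority sample at `(1.05, 1.0)` are mutual nearest neighbours with different labels, so only that majority sample is removed. In practice, a library implementation such as `TomekLinks` from imbalanced-learn, followed by a classifier such as logistic regression, would likely replace the hand-written loop shown here.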
Original language: English
Title of host publication: 2022 IEEE 5th International Conference on Big Data and Artificial Intelligence (BDAI)
Number of pages: 5
ISBN (Electronic): 978-1-6654-7081-0
Publication status: Published - 29 Aug 2022
MoE publication type: A4 Conference publication
Event: International Conference on Big Data and Artificial Intelligence - Fuzhou, China
Duration: 8 Jul 2022 → 10 Jul 2022
Conference number: 5


Conference: International Conference on Big Data and Artificial Intelligence
Abbreviated title: BDAI


  • machine learning
  • classification algorithms
  • imbalanced data
  • data preprocessing

