TY - JOUR
T1 - CN-waterfall
T2 - a deep convolutional neural network for multimodal physiological affect detection
AU - Fouladgar, Nazanin
AU - Alirezaie, Marjan
AU - Främling, Kary
N1 - Funding Information:
Open access funding provided by Umeå University. This research was funded by Umeå University. Additionally, this work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.
Publisher Copyright:
© 2021, The Author(s).
PY - 2022/2
Y1 - 2022/2
N2 - Affective computing solutions, in the literature, mainly rely on machine learning methods designed to accurately detect human affective states. Nevertheless, many of the proposed methods are based on handcrafted features, requiring sufficient expert knowledge in the realm of signal processing. With the advent of deep learning methods, attention has turned toward reduced feature engineering and more end-to-end machine learning. However, most of the proposed models rely on late fusion in a multimodal context. Meanwhile, addressing interrelations between modalities for intermediate-level data representation has been largely neglected. In this paper, we propose a novel deep convolutional neural network, called CN-Waterfall, consisting of two modules: Base and General. While the Base module focuses on the low-level representation of data from each single modality, the General module provides further information, indicating relations between modalities in the intermediate- and high-level data representations. The latter module has been designed based on theoretically grounded concepts in the Explainable AI (XAI) domain, consisting of four different fusions. These fusions are mainly tailored to correlation- and non-correlation-based modalities. To validate our model, we conduct an exhaustive experiment on WESAD and MAHNOB-HCI, two publicly and academically available datasets in the context of multimodal affective computing. We demonstrate that our proposed model significantly improves the performance of physiological-based multimodal affect detection.
KW - Data fusion
KW - Deep convolutional neural network
KW - Multimodal affect detection
KW - Physiological-based sensors
UR - http://www.scopus.com/inward/record.url?scp=85115620535&partnerID=8YFLogxK
U2 - 10.1007/s00521-021-06516-3
DO - 10.1007/s00521-021-06516-3
M3 - Article
AN - SCOPUS:85115620535
SN - 0941-0643
VL - 34
SP - 2157
EP - 2176
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 3
ER -