TY - JOUR
T1 - Subjective Evaluation of Basic Emotions from Audio–Visual Data
AU - Kadiri, Sudarsana Reddy
AU - Alku, Paavo
N1 - Funding Information:
This research was partly funded by Academy of Finland grant number 313390.
Publisher Copyright:
© 2022 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2022/7/1
Y1 - 2022/7/1
N2 - Understanding of the perception of emotions or affective states in humans is important to develop emotion-aware systems that work in realistic scenarios. In this paper, the perception of emotions in naturalistic human interaction (audio–visual data) is studied using perceptual evaluation. For this purpose, a naturalistic audio–visual emotion database collected from TV broadcasts such as soap operas and movies, called the IIIT-H Audio–Visual Emotion (IIIT-H AVE) database, is used. The database consists of audio-alone, video-alone, and audio–visual data in English. Using data of all three modes, perceptual tests are conducted for four basic emotions (angry, happy, neutral, and sad) based on category labeling and for two dimensions, namely arousal (active or passive) and valence (positive or negative), based on dimensional labeling. The results indicated that the participants’ perception of emotions was remarkably different between the audio-alone, video-alone, and audio–visual data. This finding emphasizes the importance of emotion-specific features compared to commonly used features in the development of emotion-aware systems.
KW - emotion analysis
KW - emotion recognition
KW - emotion synthesis
KW - feature extraction
KW - naturalistic audio–visual emotion database
UR - http://www.scopus.com/inward/record.url?scp=85132981539&partnerID=8YFLogxK
U2 - 10.3390/s22134931
DO - 10.3390/s22134931
M3 - Article
AN - SCOPUS:85132981539
SN - 1424-8220
VL - 22
JO - Sensors
JF - Sensors
IS - 13
M1 - 4931
ER -