Subjective Evaluation of Basic Emotions from Audio–Visual Data

Sudarsana Reddy Kadiri*, Paavo Alku

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review



Understanding how humans perceive emotions or affective states is important for developing emotion-aware systems that work in realistic scenarios. In this paper, the perception of emotions in naturalistic human interaction (audio–visual data) is studied using perceptual evaluation. For this purpose, a naturalistic audio–visual emotion database collected from TV broadcasts such as soap operas and movies, called the IIIT-H Audio–Visual Emotion (IIIT-H AVE) database, is used. The database consists of audio-alone, video-alone, and audio–visual data in English. Using data of all three modes, perceptual tests are conducted for four basic emotions (angry, happy, neutral, and sad) based on category labeling and for two dimensions, namely arousal (active or passive) and valence (positive or negative), based on dimensional labeling. The results indicated that the participants' perception of emotions differed markedly across the audio-alone, video-alone, and audio–visual data. This finding emphasizes the importance of emotion-specific features compared to commonly used features in the development of emotion-aware systems.

Original language: English
Article number: 4931
Issue number: 13
Publication status: Published - 1 Jul 2022
MoE publication type: A1 Journal article-refereed


  • emotion analysis
  • emotion recognition
  • emotion synthesis
  • feature extraction
  • naturalistic audio–visual emotion database

