Deep learning for depression recognition with audiovisual cues: A review

Lang He*, Mingyue Niu, Prayag Tiwari, Pekka Marttinen, Rui Su, Jiewei Jiang, Chenguang Guo, Hongyu Wang, Songtao Ding, Zhongmin Wang, Xiaoying Pan, Wei Dang

*Corresponding author for this work

Research output: Contribution to journal › Review Article › peer-review

23 Citations (Scopus)


With the acceleration of the pace of work and life, people are facing more and more pressure, which increases the probability of suffering from depression. However, many patients may fail to get a timely diagnosis due to the serious imbalance in the doctor–patient ratio worldwide. A promising development is that physiological and psychological studies have found differences in speech and facial expression between patients with depression and healthy individuals. Consequently, to improve current medical care, Deep Learning (DL) has been used to extract representations of depression cues from audio and video for automatic depression detection. To classify and summarize such research, we introduce the databases and describe objective markers for automatic depression estimation. We also review the DL methods that extract representations of depression from audio and video for automatic detection. Lastly, we discuss challenges and promising directions related to the automatic diagnosis of depression using DL.

Original language: English
Pages (from-to): 56-86
Number of pages: 31
Journal: Information Fusion
Publication status: Published - Apr 2022
MoE publication type: A2 Review article in a scientific journal


  • Affective computing
  • Automatic depression estimation
  • Deep learning
  • Depression
  • Review

