Abstract
Positive emotion is a pre-condition to any sales contract. Likewise, the ability to perceive the emotions of a customer impacts sales performance. To support emotional perception in buyer-seller interactions, we propose an audio-visual emotion recognition system that can recognize eight emotions: neutral, calm, sad, happy, angry, fearful, surprised, and disgusted. We reduced noise in the audio samples and applied transfer learning for image feature extraction based on the pre-trained deep neural network VGG16. For emotion recognition, we obtained audio emotion-recognition accuracies of 62.51% and 68% and video emotion-recognition accuracies of 97.13% and 97.77% on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and Surrey Audio-Visual Expressed Emotion (SAVEE) datasets, respectively. For the combination of the two models, our proposed merging mechanism, which requires no re-training, achieved an accuracy of close to 100% on both datasets. Finally, we demonstrated our system for a customer satisfaction use case in a real customer-to-salesperson interaction using the audio and video models, achieving an average accuracy of 78%.
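The abstract does not spell out how the audio and video models are merged, so the following is only a minimal sketch of one common way to combine two classifiers without re-training: a weighted average of their per-class probabilities (late fusion). The function name `fuse_predictions`, the `video_weight` parameter, and the example probabilities are hypothetical illustrations, not the authors' mechanism or data.

```python
# Illustrative late fusion of two already-trained classifiers.
# Assumption: each model outputs one probability per emotion, in this order.
EMOTIONS = ["neutral", "calm", "sad", "happy",
            "angry", "fearful", "surprised", "disgusted"]


def fuse_predictions(audio_probs, video_probs, video_weight=0.6):
    """Return (label, fused_probs) from a weighted average of two outputs.

    video_weight is a hypothetical tuning knob in [0, 1]; the audio model
    receives the remaining weight. No model parameters are updated.
    """
    if len(audio_probs) != len(EMOTIONS) or len(video_probs) != len(EMOTIONS):
        raise ValueError("expected one probability per emotion")
    if not 0.0 <= video_weight <= 1.0:
        raise ValueError("video_weight must lie in [0, 1]")

    audio_weight = 1.0 - video_weight
    fused = [audio_weight * a + video_weight * v
             for a, v in zip(audio_probs, video_probs)]

    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused probabilities sum to zero")
    fused = [p / total for p in fused]  # renormalise to sum to 1

    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best], fused


# Made-up example: the audio model leans towards "sad",
# the video model leans more strongly towards "happy".
audio = [0.10, 0.05, 0.40, 0.15, 0.10, 0.10, 0.05, 0.05]
video = [0.05, 0.05, 0.10, 0.60, 0.05, 0.05, 0.05, 0.05]
label, fused = fuse_predictions(audio, video)
print(label)
```

Because the fusion only post-processes model outputs, either model can be swapped or re-weighted independently; setting `video_weight` to 0 or 1 falls back to a single modality.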
Original language | English |
---|---|
Title of host publication | Proceedings of 16th International Conference on Mobility, Sensing and Networking (MSN 2020) |
Publisher | IEEE |
Pages | 584-591 |
Number of pages | 8 |
ISBN (Electronic) | 978-1-7281-9916-0 |
Publication status | Published - Apr 2021 |
MoE publication type | A4 Conference publication |
Event | International Conference on Mobility, Sensing and Networking - Tokyo, Japan; Duration: 17 Dec 2020 → 19 Dec 2020; Conference number: 16 |
Conference
Conference | International Conference on Mobility, Sensing and Networking |
---|---|
Abbreviated title | MSN |
Country/Territory | Japan |
City | Tokyo |
Period | 17/12/2020 → 19/12/2020 |
Keywords
- Customer satisfaction
- Deep learning
- Emotion recognition
- Internet of Things
- Transfer learning