Abstract
Positive emotion is a precondition for any sales contract. Likewise, the ability to perceive a customer's emotions affects sales performance. To support emotional perception in buyer-seller interactions, we propose an audio-visual emotion recognition system that can recognize eight emotions: neutral, calm, sad, happy, angry, fearful, surprised, and disgusted. We reduced noise in the audio samples and applied transfer learning for image feature extraction, based on the pre-trained deep neural network VGG16. For emotion recognition, we obtained an audio emotion-recognition accuracy of 62.51% and 68% and a video emotion-recognition accuracy of 97.13% and 97.77% on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and Surrey Audio-Visual Expressed Emotion (SAVEE) datasets, respectively. For the combination of the two models, our proposed merging mechanism, which requires no re-training, achieved an accuracy of close to 100% on both datasets. Finally, we demonstrated our system in a customer satisfaction use case, a real customer-to-salesperson interaction, using the audio and video models and achieving an average accuracy of 78%.
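The abstract does not describe how the merging mechanism works. As an illustration only, one common way to combine two already-trained classifiers without re-training is late fusion: each model outputs a probability per emotion, and the two vectors are averaged with a fixed weight. The sketch below assumes that setup. The function name `fuse_predictions`, the `audio_weight` parameter, and the example probabilities are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# The eight emotion classes named in the abstract.
EMOTIONS = ["neutral", "calm", "sad", "happy",
            "angry", "fearful", "surprised", "disgusted"]


def fuse_predictions(audio_probs: Sequence[float],
                     video_probs: Sequence[float],
                     audio_weight: float = 0.5) -> Tuple[str, List[float]]:
    """Weighted average of two per-class probability vectors.

    Hypothetical late-fusion rule, not the paper's mechanism:
    fused[i] = w * audio[i] + (1 - w) * video[i].
    """
    if len(audio_probs) != len(EMOTIONS) or len(video_probs) != len(EMOTIONS):
        raise ValueError("each model must output one probability per emotion")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")

    video_weight = 1.0 - audio_weight
    fused = [audio_weight * a + video_weight * v
             for a, v in zip(audio_probs, video_probs)]

    # Pick the class with the highest fused probability.
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best], fused


# Made-up outputs: the audio model leans "happy", the video model leans "angry".
audio = [0.10, 0.10, 0.10, 0.40, 0.10, 0.10, 0.05, 0.05]
video = [0.05, 0.05, 0.05, 0.10, 0.60, 0.05, 0.05, 0.05]
label, fused = fuse_predictions(audio, video)
```

Because the rule only reweights existing outputs, neither model needs to be trained again. In practice the weight would have to be chosen on held-out data, for example favoring the video model when it is the more accurate of the two.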
Original language | English |
---|---|
Title of host publication | Proceedings of 16th International Conference on Mobility, Sensing and Networking (MSN 2020) |
Publisher | IEEE |
Pages | 584-591 |
Number of pages | 8 |
ISBN (electronic) | 978-1-7281-9916-0 |
DOIs | |
Status | Published - Apr 2021 |
MoEC publication type | A4 Article in conference proceedings |
Event | International Conference on Mobility, Sensing and Networking - Tokyo, Japan. Duration: 17 Dec 2020 → 19 Dec 2020. Conference number: 16 |
Conference
Conference | International Conference on Mobility, Sensing and Networking |
---|---|
Abbreviated title | MSN |
Country/Territory | Japan |
City | Tokyo |
Period | 17/12/2020 → 19/12/2020 |