A Multimodal Framework for Large-Scale Emotion Recognition by Fusing Music and Electrodermal Activity Signals

Published: 04 March 2022

Abstract

Considerable attention has been paid to physiological signal-based emotion recognition in the field of affective computing. Thanks to its reliability and user-friendly acquisition, electrodermal activity (EDA) offers great practical advantages. However, EDA-based emotion recognition across large numbers of subjects remains a difficult problem: traditional classifiers built on hand-crafted features produce poor results because of their limited representation ability, while deep learning models with automatic feature extraction suffer from overfitting caused by large-scale individual differences. Since music is strongly correlated with human emotion, the static music stimulus can serve as an external benchmark that constrains the highly variable dynamic EDA signals. In this article, we therefore fuse each subject’s individual EDA features with the features of the externally evoked music, and we propose an end-to-end multimodal framework, the one-dimensional residual temporal and channel attention network (RTCAN-1D). For the EDA branch, we are the first to introduce a channel-temporal attention mechanism for EDA-based emotion recognition, mining both the dynamic and the steady components of the signal along the temporal and channel dimensions. Comparisons with state-of-the-art single-modality EDA models on the DEAP and AMIGOS datasets demonstrate the effectiveness of RTCAN-1D at mining EDA features. For the music branch, we process the music signal with the open-source toolkit openSMILE to obtain external feature vectors. We conducted systematic and extensive evaluations; the experiments on PMEmo, currently the largest music emotion dataset, validate that fusing EDA and music is a reliable and efficient solution for large-scale emotion recognition.
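
To make the fusion idea concrete, here is a minimal sketch, in PyTorch, of an SE-style channel attention block over a one-dimensional EDA feature map, late-fused with a precomputed music feature vector. All module names, layer sizes, and the music_dim placeholder are illustrative assumptions, not the authors' exact RTCAN-1D architecture.

    import torch
    import torch.nn as nn

    class ChannelAttention1D(nn.Module):
        """SE-style squeeze-and-excitation over the channel axis.
        Assumption: the paper's channel attention is in this spirit."""
        def __init__(self, channels, reduction=8):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
                nn.Sigmoid(),
            )

        def forward(self, x):
            # x: (batch, channels, time); squeeze over time, re-weight channels
            w = self.fc(x.mean(dim=-1))
            return x * w.unsqueeze(-1)

    class EdaMusicFusion(nn.Module):
        """EDA branch with channel attention, late-fused with a music vector."""
        def __init__(self, music_dim=6373, n_classes=2):
            super().__init__()
            self.eda_branch = nn.Sequential(
                nn.Conv1d(1, 32, kernel_size=7, padding=3),
                nn.BatchNorm1d(32),
                nn.ReLU(inplace=True),
                ChannelAttention1D(32),
                nn.AdaptiveAvgPool1d(1),  # pool over time -> (batch, 32, 1)
            )
            self.head = nn.Linear(32 + music_dim, n_classes)

        def forward(self, eda, music):
            # eda: (batch, 1, T) EDA signal; music: (batch, music_dim) vector
            z = self.eda_branch(eda).squeeze(-1)
            return self.head(torch.cat([z, music], dim=1))

For the music branch, the abstract only states that openSMILE produces the external feature vectors. Assuming the audEERING opensmile Python wrapper and the ComParE_2016 functionals (6,373 values per clip, which is where music_dim above comes from), extraction could look like:

    import opensmile

    smile = opensmile.Smile(
        feature_set=opensmile.FeatureSet.ComParE_2016,
        feature_level=opensmile.FeatureLevel.Functionals,
    )
    music_features = smile.process_file("song.mp3")  # one row of functionals per file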




Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 18, Issue 3
August 2022
478 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/3505208

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 March 2022
Accepted: 01 October 2021
Revised: 01 September 2021
Received: 01 January 2021
Published in TOMM Volume 18, Issue 3


Author Tags

  1. Multimodal fusion
  2. Large-scale emotion recognition
  3. Attention mechanism

Qualifiers

  • Research-article
  • Refereed


Cited By

  • (2024) A Domain Generalization and Residual Network-Based Emotion Recognition from Physiological Signals. Cyborg and Bionic Systems, 5. https://doi.org/10.34133/cbsystems.0074. Online publication date: 5-Feb-2024.
  • (2024) FFA-BiGRU: Attention-Based Spatial-Temporal Feature Extraction Model for Music Emotion Classification. Applied Sciences, 14(16), 6866. https://doi.org/10.3390/app14166866. Online publication date: 6-Aug-2024.
  • (2024) From CNNs to Transformers in Multimodal Human Action Recognition: A Survey. ACM Transactions on Multimedia Computing, Communications, and Applications, 20(8), 1-24. https://doi.org/10.1145/3664815. Online publication date: 13-May-2024.
  • (2024) A novel physiological signal denoising method coupled with multispectral adaptive wavelet denoising (MAWD) and unsupervised source counting algorithm (USCA). Journal of Engineering Research, 12(2), 175-189. https://doi.org/10.1016/j.jer.2023.07.016. Online publication date: Jun-2024.
  • (2024) Interfering implicit attitudes of adopting recycled products from construction wastes. Journal of Cleaner Production, 464, 142775. https://doi.org/10.1016/j.jclepro.2024.142775. Online publication date: Jul-2024.
  • (2024) Multimodal Emotion Recognition with Deep Learning: Advancements, challenges, and future directions. Information Fusion, 105, 102218. https://doi.org/10.1016/j.inffus.2023.102218. Online publication date: May-2024.
  • (2024) Machine learning for human emotion recognition: a comprehensive review. Neural Computing and Applications, 36(16), 8901-8947. https://doi.org/10.1007/s00521-024-09426-2. Online publication date: 20-Feb-2024.
  • (2023) Enhancement of Human Feeling via AI-based BCI: A Survey. Highlights in Science, Engineering and Technology, 36, 633-637. https://doi.org/10.54097/hset.v36i.5748. Online publication date: 21-Mar-2023.
  • (2023) Review of Studies on Emotion Recognition and Judgment Based on Physiological Signals. Applied Sciences, 13(4), 2573. https://doi.org/10.3390/app13042573. Online publication date: 16-Feb-2023.
  • (2023) Recognizing emotions induced by wearable haptic vibration using noninvasive electroencephalogram. Frontiers in Neuroscience, 17. https://doi.org/10.3389/fnins.2023.1219553. Online publication date: 6-Jul-2023.
