Comparing Recognition Performance and Robustness of Multimodal Deep Learning Models for Multimodal Emotion Recognition
Published in: IEEE Transactions on Cognitive and Developmental Systems, 2022-06, Vol. 14 (2), p. 715-729
Main Authors:
Format: Article
Language: English
Subjects:
Summary: Multimodal signals are powerful for emotion recognition since they can represent emotions comprehensively. In this article, we compare the recognition performance and robustness of two multimodal emotion recognition models: 1) deep canonical correlation analysis (DCCA) and 2) bimodal deep autoencoder (BDAE). The contributions of this article are threefold: 1) we propose two methods for extending the original DCCA model for multimodal fusion: a) weighted-sum fusion and b) attention-based fusion; 2) we systematically compare the performance of DCCA, BDAE, and traditional approaches on five multimodal data sets; and 3) we investigate the robustness of DCCA, BDAE, and traditional approaches on the SEED-V and DREAMER data sets under two conditions: 1) adding noise to multimodal features and 2) replacing electroencephalography (EEG) features with noise. Our experimental results demonstrate that DCCA achieves state-of-the-art recognition results on all five data sets: 1) 94.6% on the SEED data set; 2) 87.5% on the SEED-IV data set; 3) 84.3% and 85.6% on the DEAP data set; 4) 85.3% on the SEED-V data set; and 5) 89.0%, 90.6%, and 90.7% on the DREAMER data set. Meanwhile, DCCA is more robust when various amounts of noise are added to the SEED-V and DREAMER data sets. By visualizing features before and after DCCA transformation on the SEED-V data set, we find that the transformed features are more homogeneous and discriminative across emotions.
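The summary mentions two fusion extensions to DCCA: weighted-sum fusion and attention-based fusion of the DCCA-transformed modality features. The following is a minimal NumPy sketch of those two fusion rules, not the authors' implementation: it assumes DCCA has already projected EEG and eye-movement features into a shared space, and the function names, the fixed weight `alpha`, and the norm-based attention scores are all hypothetical illustrations.

```python
import numpy as np

def fuse_weighted(z_eeg, z_eye, alpha=0.5):
    """Weighted-sum fusion: a fixed convex combination of the two
    DCCA-transformed modality features (alpha is a hyperparameter)."""
    return alpha * z_eeg + (1.0 - alpha) * z_eye

def fuse_attention(z_eeg, z_eye):
    """Attention-based fusion: per-sample weights from a softmax over
    scalar scores for each modality. Scoring by feature-vector norm is
    an assumption here, not necessarily the paper's scoring rule."""
    scores = np.stack([np.linalg.norm(z_eeg, axis=1),
                       np.linalg.norm(z_eye, axis=1)], axis=1)   # (N, 2)
    e = np.exp(scores - scores.max(axis=1, keepdims=True))       # stable softmax
    w = e / e.sum(axis=1, keepdims=True)
    return w[:, :1] * z_eeg + w[:, 1:] * z_eye                   # broadcast (N, 1)

# Toy usage: 32 samples in a 20-dimensional shared DCCA space.
rng = np.random.default_rng(0)
z_eeg, z_eye = rng.normal(size=(32, 20)), rng.normal(size=(32, 20))
print(fuse_attention(z_eeg, z_eye).shape)  # (32, 20)
```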
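The two robustness conditions described in the summary can likewise be sketched. The protocol below is a hedged reading of the abstract only: the Gaussian noise model, the noise level, and the 310-dimensional EEG feature size (e.g., differential-entropy features) are assumptions, not details confirmed by this record.

```python
import numpy as np

def add_feature_noise(x, noise_std, rng):
    """Condition 1: perturb multimodal features with zero-mean Gaussian
    noise of a chosen standard deviation (distribution is an assumption)."""
    return x + rng.normal(scale=noise_std, size=x.shape)

def replace_with_noise(x, rng):
    """Condition 2: discard the EEG features entirely and substitute pure
    noise of the same shape, simulating an unusable EEG signal."""
    return rng.normal(size=x.shape)

rng = np.random.default_rng(42)
x_eeg = rng.normal(size=(32, 310))          # hypothetical EEG feature matrix
noisy = add_feature_noise(x_eeg, 0.1, rng)  # condition 1: noisy features
broken = replace_with_noise(x_eeg, rng)     # condition 2: EEG replaced by noise
```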
ISSN: 2379-8920, 2379-8939
DOI: 10.1109/TCDS.2021.3071170