MST-ARGCN: modality-squeeze transformer with attentional recurrent graph capsule network for multimodal sentiment analysis
Published in: The Journal of Supercomputing, 2025, Vol. 81 (1), Article 86
Main Authors: , , , ,
Format: Article
Language: English
Citations: Items that this one cites
Online Access: Get full text
Summary: Multimodal sentiment analysis (MSA) is a hot topic in deep learning research. Despite the progress made in previous studies, existing methods based on the multimodal transformer (MulT) tend to be extremely expensive: they adopt all six crossmodal transformers to extract representation sequences, neglect to simplify the complicated computational procedure, and do not take the unequal contributions of the individual modalities into account. In addition, these methods cannot effectively model long-range dependencies, which results in poor performance. To address these problems, we propose a modality-squeeze transformer with an attentional recurrent graph capsule network (MST-ARGCN) for MSA. It first squeezes the three modalities through low-rank fusion to obtain a multimodal fused vector. It then uses only one crossmodal transformer, with the multimodal fused vector as the source modality and text as the target modality, to extract the representation sequence for the subsequent networks, which greatly reduces the number of network parameters. In addition, the ARGCN is introduced to strengthen the learning of long-range dependencies during the outer-loop graph aggregation stage, further improving performance. We evaluate our model on the CMU-MOSEI and CMU-MOSI datasets. The experimental results show that our model achieves competitive performance with low computational complexity.
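The summary describes two concrete steps: squeezing the three modalities into one fused vector via low-rank fusion, and running a single crossmodal transformer with that fused vector as the source and text as the target. The sketch below illustrates those two steps only, assuming a standard low-rank multimodal fusion formulation and a MulT-style crossmodal attention layer; all class names, layer sizes, and the rank value are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of (1) low-rank fusion of three modality vectors and
# (2) one crossmodal attention block with text as target, fused vector as source.
# Names and hyperparameters are hypothetical.
import torch
import torch.nn as nn


class LowRankFusion(nn.Module):
    """Squeeze text/audio/vision vectors into one fused vector via rank-r factors."""

    def __init__(self, dims, fused_dim, rank=4):
        super().__init__()
        # One low-rank factor per modality; +1 accounts for the appended bias term.
        self.factors = nn.ParameterList(
            [nn.Parameter(torch.randn(rank, d + 1, fused_dim) * 0.02) for d in dims]
        )
        self.fusion_weights = nn.Parameter(torch.randn(1, rank) * 0.02)
        self.fusion_bias = nn.Parameter(torch.zeros(1, fused_dim))

    def forward(self, xs):  # xs: list of (batch, dim_m) tensors, one per modality
        fused = None
        for x, factor in zip(xs, self.factors):
            ones = torch.ones(x.size(0), 1, device=x.device)
            x1 = torch.cat([x, ones], dim=-1)                  # (batch, dim+1)
            proj = torch.einsum("bd,rdf->rbf", x1, factor)     # (rank, batch, fused)
            fused = proj if fused is None else fused * proj    # elementwise product
        out = torch.einsum("or,rbf->bf", self.fusion_weights, fused) + self.fusion_bias
        return out                                             # (batch, fused_dim)


class CrossmodalBlock(nn.Module):
    """Single crossmodal attention layer: text queries attend to the fused vector."""

    def __init__(self, d_model, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, text_seq, fused_vec):
        # text_seq: (batch, seq_len, d_model) target; fused_vec: (batch, d_model) source.
        src = fused_vec.unsqueeze(1)                           # (batch, 1, d_model)
        attn_out, _ = self.attn(text_seq, src, src)
        return self.norm(text_seq + attn_out)


# Toy usage with random features standing in for per-modality encoders.
text = torch.randn(2, 20, 64)    # (batch, seq_len, d_text)
audio = torch.randn(2, 32)
vision = torch.randn(2, 48)

fuse = LowRankFusion(dims=[64, 32, 48], fused_dim=64)
block = CrossmodalBlock(d_model=64)
fused = fuse([text.mean(dim=1), audio, vision])   # pool text to a vector for fusion
out = block(text, fused)                          # (2, 20, 64)
print(out.shape)
```

Because only one crossmodal attention block is used instead of the six pairwise blocks in MulT, the parameter count of this stage shrinks accordingly, which is the efficiency argument the summary makes.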
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-024-06588-7