Measuring Similarity of Dual-Modal Academic Data Based on Multi-Fusion Representation Learning
Published in: IEEE Access, 2024, Vol. 12, pp. 97701-97711
Main Authors:
Format: Article
Language: English
Summary: Nowadays, academic materials such as articles, patents, lecture notes, and observation records often use both texts and images (i.e., dual-modal data) to illustrate scientific issues. Measuring the similarity of such dual-modal academic data largely depends on dual-modal features, which remain far from satisfactory in practice. To learn dual-modal feature representations, most current approaches mine interactions between texts and images on top of their fusion networks. This work proposes a multi-fusion deep learning framework that learns semantically richer dual-modal representations. The framework places multiple fusion points in feature spaces at various levels and gradually integrates the fusion information from the low level to the high level. In addition, we develop a multi-channel decoding network with alternate fine-tuning strategies to thoroughly mine modality-specific features and cross-modal correlations. To our knowledge, this is the first work to bring deep fusion learning to dual-modal academic data. It reduces the semantic and statistical attribute differences between the two modalities, thereby learning robust representations. Extensive experiments on real-world data sets show that our method performs significantly better than state-of-the-art approaches.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3427731
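
The summary above describes multiple fusion points across feature levels, gradual low-to-high integration of fused information, and a multi-channel decoder trained with alternating fine-tuning. The sketch below shows one way such a pipeline could be wired up in PyTorch; the gated-sum fusion, layer widths, and module names are illustrative assumptions and not the authors' published architecture.

```python
# Minimal sketch of a multi-fusion dual-modal encoder with a two-channel decoder.
# All design choices here (gated-sum fusion, dimensions) are assumptions for illustration.
import torch
import torch.nn as nn


class FusionPoint(nn.Module):
    """Merges a text feature and an image feature of equal width with a learned gate."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, text_feat, image_feat):
        g = self.gate(torch.cat([text_feat, image_feat], dim=-1))
        return g * text_feat + (1.0 - g) * image_feat


class MultiFusionEncoder(nn.Module):
    """Places fusion points at low-, mid-, and high-level feature spaces and
    carries the fused signal upward so higher levels see earlier fusions."""

    def __init__(self, text_dim=300, image_dim=512, dims=(256, 128, 64)):
        super().__init__()
        self.text_proj = nn.ModuleList()
        self.image_proj = nn.ModuleList()
        self.fusions = nn.ModuleList()
        t_in, i_in = text_dim, image_dim
        for d in dims:
            self.text_proj.append(nn.Sequential(nn.Linear(t_in, d), nn.ReLU()))
            self.image_proj.append(nn.Sequential(nn.Linear(i_in, d), nn.ReLU()))
            self.fusions.append(FusionPoint(d))
            t_in = i_in = d  # the next level consumes the fused representation

    def forward(self, text, image):
        t, v = text, image
        fused = None
        for tp, vp, fuse in zip(self.text_proj, self.image_proj, self.fusions):
            t, v = tp(t), vp(v)
            fused = fuse(t, v)
            # Gradual integration: the fused code feeds both streams
            # before the next (higher-level) fusion point.
            t = v = fused
        return fused


class DualChannelDecoder(nn.Module):
    """Two reconstruction channels that recover modality-specific detail from the
    shared code; alternating which channel is trained is one way to realize an
    alternate fine-tuning strategy."""

    def __init__(self, code_dim=64, text_dim=300, image_dim=512):
        super().__init__()
        self.text_head = nn.Linear(code_dim, text_dim)
        self.image_head = nn.Linear(code_dim, image_dim)

    def forward(self, code):
        return self.text_head(code), self.image_head(code)


if __name__ == "__main__":
    enc, dec = MultiFusionEncoder(), DualChannelDecoder()
    text = torch.randn(4, 300)   # e.g. pooled word embeddings
    image = torch.randn(4, 512)  # e.g. pooled CNN features
    code = enc(text, image)
    t_rec, v_rec = dec(code)
    # Similarity between two documents is then the cosine of their fused codes.
    sim = nn.functional.cosine_similarity(code[0], code[1], dim=0)
    print(code.shape, t_rec.shape, v_rec.shape, float(sim))
```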