Measuring Similarity of Dual-Modal Academic Data Based on Multi-Fusion Representation Learning

Bibliographic Details
Published in: IEEE Access, 2024, Vol. 12, p. 97701-97711
Main Authors: Zhang, Li; Gao, Qiang; Liu, Ming; Gu, Zepeng; Lang, Bo
Format: Article
Language: English
Description
Summary: Nowadays, academic materials such as articles, patents, lecture notes, and observation records often use both texts and images (i.e., dual-modal data) to illustrate scientific issues. Measuring the similarity of such dual-modal academic data depends largely on dual-modal features, which remain far from satisfactory in practice. To learn dual-modal feature representations, most current approaches mine the interactions between texts and images on top of their fusion networks. This work proposes a multi-fusion deep learning framework that learns semantically richer dual-modal representations. The framework places multiple fusion points in feature spaces at various levels and gradually integrates the fusion information from low-level to high-level features. In addition, we develop a multi-channel decoding network with alternating fine-tuning strategies to thoroughly mine modality-specific features and cross-modal correlations. To our knowledge, this is the first work to bring deep fusion learning to dual-modal academic data. It reduces the semantic and statistical attribute differences between the two modalities, thereby learning robust representations. Extensive experiments on real-world datasets show that our method achieves significant performance gains compared with state-of-the-art approaches.
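
The summary describes multiple fusion points placed at different feature levels, with fused information carried upward from low-level to high-level layers. Below is a minimal sketch of that multi-level fusion idea, assuming a PyTorch-style implementation; the module names, layer sizes, and concatenation-based fusion operator are illustrative assumptions, not the authors' actual architecture.

```python
# A minimal sketch of multi-level dual-modal fusion (illustrative only; not the
# paper's architecture). Text and image features are encoded level by level,
# and each level's fusion output is carried into the next, higher-level fusion.
import torch
import torch.nn as nn


class MultiFusionEncoder(nn.Module):
    """Encodes text and image features with a fusion point at each level."""

    def __init__(self, text_dim=768, image_dim=2048, hidden_dims=(512, 256, 128)):
        super().__init__()
        self.text_layers = nn.ModuleList()
        self.image_layers = nn.ModuleList()
        self.fusion_layers = nn.ModuleList()

        t_in, i_in, f_in = text_dim, image_dim, 0
        for h in hidden_dims:
            self.text_layers.append(nn.Sequential(nn.Linear(t_in, h), nn.ReLU()))
            self.image_layers.append(nn.Sequential(nn.Linear(i_in, h), nn.ReLU()))
            # Each fusion point sees the current text/image features plus the
            # fused features carried over from the previous (lower) level.
            self.fusion_layers.append(nn.Sequential(nn.Linear(2 * h + f_in, h), nn.ReLU()))
            t_in, i_in, f_in = h, h, h

    def forward(self, text_feat, image_feat):
        fused = None
        for t_layer, i_layer, f_layer in zip(
            self.text_layers, self.image_layers, self.fusion_layers
        ):
            text_feat = t_layer(text_feat)
            image_feat = i_layer(image_feat)
            joint = torch.cat([text_feat, image_feat], dim=-1)
            if fused is not None:
                # Gradually integrate low-level fusion information upward.
                joint = torch.cat([joint, fused], dim=-1)
            fused = f_layer(joint)
        return fused  # final dual-modal representation


if __name__ == "__main__":
    # Toy usage: a batch of 4 items with pre-extracted text/image features.
    text = torch.randn(4, 768)
    image = torch.randn(4, 2048)
    rep = MultiFusionEncoder()(text, image)
    print(rep.shape)  # torch.Size([4, 128])
```

Similarity between two academic items could then be measured, for example, as the cosine similarity of their final fused representations; the decoding network and alternating fine-tuning mentioned in the summary are omitted from this sketch.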
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3427731