Loading…

Semantic enhanced Top-k similarity search on weighted HIN

Similarity searches on heterogeneous information networks (HINs) have attracted wide attention from both industrial and academic areas in recent years; for example, they have been used for friend detection in social networks and collaborator recommendation in coauthor networks. The structural inform...

Full description

Saved in:
Bibliographic Details
Published in:Neural computing & applications 2022-10, Vol.34 (19), p.16911-16927
Main Authors: Zhang, Yun, Yu, Minghe, Zhang, Tiancheng, Yu, Ge
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Similarity searches on heterogeneous information networks (HINs) have attracted wide attention from both industrial and academic areas in recent years; for example, they have been used for friend detection in social networks and collaborator recommendation in coauthor networks. The structural information on the HIN can be captured by multiple metapaths, and people usually utilize metapaths to design methods for similarity search. The rich semantics in HINs are not only structural information but also content stored in nodes. However, the content similarity of nodes was usually not valued in the existing methods. Although some researchers have recently considered both types of information in machine learning-based methods for similarity search, they have used structure and content information separately. To address this issue by balancing the influence of structure and content information flexibly in the process of searching, we propose a double channel convolutional neural network model for top-k similarity search, which uses path instances as model inputs and generates structure and content embeddings for nodes based on different metapaths. We design an attention mechanism to enhance the differences in metapaths for each node. Another attention mechanism is used to combine the content and structure information of nodes. Finally, an importance evaluation function is designed to improve the accuracy and make the model more explainable. The experimental results show that our search algorithm can effectively support top-k similarity search in HINs and achieve higher performance than existing approaches.
ISSN:0941-0643
1433-3058
DOI:10.1007/s00521-022-07339-6