A cross-domain transfer learning model for author name disambiguation on heterogeneous graph with pretrained language model
Published in: | Knowledge-based systems 2024-12, Vol.305, p.112624, Article 112624 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Author names in scientific literature are often ambiguous, complicating the accurate retrieval of academic information. Furthermore, many author names are shared by multiple scholars, making it challenging to construct academic search engine knowledge bases. These issues highlight the need for effective author name disambiguation. Existing methods have limitations in handling text content and heterogeneous graph node representations and often require extensive annotated training data. This study introduces an academic heterogeneous graph embedding neural network, HGNN-S, which leverages a pretrained semantic language model to integrate semantic information from texts, heterogeneous attribute relationships, and heterogeneous neighbor data. Trained on a small amount of single-domain annotated data, HGNN-S can disambiguate names across multiple domains. Experimental results demonstrate that our model outperforms current state-of-the-art methods and enhances search performance on the China National Platform, Kejso.
• A heterogeneous graph embedding NN based on the pre-trained semantic language model.
• Our model considers deep-level text semantic information and relational information.
• The model performs well in transfer learning.
• The model performs well in encoding nodes across multiple tasks. |
---|---|
ISSN: | 0950-7051 |
DOI: | 10.1016/j.knosys.2024.112624 |