Exploiting similarities across multiple dimensions for author name disambiguation
Published in: | Scientometrics 2021-09, Vol.126 (9), p.7525-7560 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | In bibliometric analysis, ambiguity in author names may lead to erroneous aggregation of records. Author name disambiguation techniques attempt to address this issue by attributing records to the corresponding author. Name disambiguation has been widely studied as a clustering task. However, maintaining consistent accuracy across datasets is still a major challenge. Recent efforts have witnessed the use of representation-learning-based techniques to map the records to an embedding space that can be used to determine the clusters. However, some of these models that use supervised global embedding fail to generalize across different datasets, while others lag in accuracy. In this paper, we propose a method that uses two independent relations among the documents, co-authorship and meta-content of the documents, to generate a latent representation of documents that is capable of generalizing over various datasets (consisting of different sets of features). Through rigorous validation, we find that the proposed approach outperforms several state-of-the-art methods by a significant margin in terms of standard measures such as pairwise F1, K metric, and BF1 scores. Moreover, we have also validated the performance of our method with a statistical test. |
---|---|
ISSN: | 0138-9130 1588-2861 |
DOI: | 10.1007/s11192-021-04101-y |