Multilingual Autoregressive Entity Linking
Published in: | Transactions of the Association for Computational Linguistics 2022-03, Vol.10, p.274-290 |
---|---|
Main Authors: | , , , , , , , , , |
Format: | Article |
Language: | English |
Summary: | We present mGENRE, a sequence-to-sequence system for the Multilingual Entity
Linking (MEL) problem—the task of resolving language-specific mentions to
a multilingual Knowledge Base (KB). For a mention in a given language, mGENRE
predicts the name of the target entity left-to-right, token-by-token in an
autoregressive fashion. The autoregressive formulation allows us to effectively
cross-encode mention string and entity names to capture more interactions than
the standard dot product between mention and entity vectors. It also enables
fast search within a large KB even for mentions that do not appear in mention
tables and with no need for large-scale vector indices. While prior MEL works
use a single representation for each entity, we match against entity names of as
many languages as possible, which allows exploiting language connections between
source input and target name. Moreover, in a zero-shot setting on languages with
no training data at all, mGENRE treats the target language as a latent variable
that is marginalized at prediction time. This leads to over 50%
improvements in average accuracy. We show the efficacy of our approach through
extensive evaluation including experiments on three popular MEL benchmarks where
we establish new state-of-the-art results. Source code available at
. |
---|---|
ISSN: | 2307-387X |
DOI: | 10.1162/tacl_a_00460 |
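
The summary states that, in the zero-shot setting, the target language is treated as a latent variable that is marginalized at prediction time. A minimal sketch of that idea follows, assuming each candidate is an (entity, language, log-probability) triple already scored by some sequence-to-sequence model. The function names, entity identifiers, and probabilities are hypothetical illustrations, not the authors' implementation.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginalize_over_languages(candidates):
    """Sum each entity's probability over all of its language-specific names.

    candidates: iterable of (entity_id, language, log_prob) triples, where
    log_prob is the model's log-probability of generating that entity's
    name in that language (assumed to be given).
    Returns (entity_id, marginal_log_prob) pairs, best first.
    """
    by_entity = defaultdict(list)
    for entity_id, _language, log_prob in candidates:
        by_entity[entity_id].append(log_prob)
    scores = {e: logsumexp(lps) for e, lps in by_entity.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores: entity_B has the single most likely name, but
# entity_A collects more probability mass across its two languages.
candidates = [
    ("entity_A", "de", math.log(0.30)),
    ("entity_A", "en", math.log(0.25)),
    ("entity_B", "en", math.log(0.40)),
]
ranking = marginalize_over_languages(candidates)
best_entity, best_log_prob = ranking[0]
```

In this toy input the marginal score prefers `entity_A`, whereas picking the single highest-scoring name would have returned `entity_B`. That difference is the effect marginalization is meant to capture.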