Position-aware self-attention based neural sequence labeling

Bibliographic Details
Published in: Pattern Recognition, 2021-02, Vol. 110, p. 107636, Article 107636
Main Authors: Wei, Wei; Wang, Zanbo; Mao, Xianling; Zhou, Guangyou; Zhou, Pan; Jiang, Sheng
Format: Article
Language: English
Description
Summary:
• This paper identifies the problem of modeling discrete context dependencies in sequence labeling tasks.
• This paper develops a self-attentional context fusion network that provides complementary context information on top of a Bi-LSTM.
• This paper proposes a novel position-aware self-attention that incorporates three different positional factors to exploit the relative position information among tokens.
• The proposed model achieves state-of-the-art performance on part-of-speech (POS) tagging, named entity recognition (NER), and phrase chunking.

Sequence labeling is a fundamental task in natural language processing and has been widely studied. Recently, RNN-based sequence labeling models have attracted increasing attention. Although they achieve strong performance by learning long short-term (i.e., successive) dependencies, processing inputs sequentially may limit their ability to capture non-contiguous relations among tokens within a sentence. To tackle this problem, we focus on how to effectively model both the successive and the discrete dependencies of each token in order to improve sequence labeling performance. Specifically, we propose an attention-based model, position-aware self-attention (PSA), together with a self-attentional context fusion layer within a neural network architecture, to exploit the positional information of an input sequence and capture the latent relations among tokens. Extensive experiments on three classical sequence labeling tasks, i.e., part-of-speech (POS) tagging, named entity recognition (NER), and phrase chunking, demonstrate that the proposed model outperforms state-of-the-art methods on various metrics without using any external knowledge.
ISSN: 0031-3203; 1873-5142
DOI: 10.1016/j.patcog.2020.107636
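
The summary above describes self-attention that takes the relative positions of tokens into account, but this record does not spell out the three positional factors or any formulas. As a rough illustration of the general idea only, the sketch below adds a distance-based bias to ordinary scaled dot-product self-attention. The function name `position_biased_self_attention`, the Gaussian-style penalty, the `sigma` parameter, and the use of raw token vectors as queries, keys, and values are all assumptions made for this example; they are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def position_biased_self_attention(tokens, sigma=2.0):
    """Self-attention over `tokens`, a list of equal-length float vectors.

    Illustrative assumptions (NOT the paper's formulation):
      * queries, keys, and values are the raw token vectors
        (no learned projections);
      * the positional term is a Gaussian-style penalty
        -(i - j)^2 / (2 * sigma^2) added to the content score.

    Returns (outputs, weights), where weights[i][j] is how much
    token i attends to token j.
    """
    n = len(tokens)
    d = len(tokens[0])
    outputs, weights = [], []
    for i in range(n):
        scores = []
        for j in range(n):
            # Content term: scaled dot product between token i and token j.
            content = sum(a * b for a, b in zip(tokens[i], tokens[j])) / math.sqrt(d)
            # Position term: penalize pairs that are far apart in the sequence.
            position = -((i - j) ** 2) / (2.0 * sigma ** 2)
            scores.append(content + position)
        row = softmax(scores)
        weights.append(row)
        # Output for token i: attention-weighted average of all token vectors.
        outputs.append([sum(row[j] * tokens[j][k] for j in range(n)) for k in range(d)])
    return outputs, weights
```

Because the bias is added to the scores before the softmax, a distant token can still receive high weight when its content score is large, which is one simple way to let attention reach non-contiguous tokens while still favoring nearby ones. A larger `sigma` weakens the distance penalty.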