
Self-Training With Double Selectors for Low-Resource Named Entity Recognition

Bibliographic Details
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023-01, Vol. 31, p. 1-11
Main Authors: Fu, Yingwen, Lin, Nankai, Yu, Xiaohui, Jiang, Shengyi
Format: Article
Language: English
Description
Summary: Named Entity Recognition (NER) is fundamental to many downstream natural language processing (NLP) tasks, but most advanced NER methods rely heavily on massive labeled data that is costly to obtain. In this paper, we explore the effectiveness of self-training, a semi-supervised approach that reduces reliance on manual annotation, for low-resource NER. However, random pseudo-sample selection in the standard self-training framework may cause serious error propagation, especially for token-level tasks. To this end, this paper focuses on pseudo-sample selection and proposes a new self-training framework with double selectors, namely an auxiliary judge task and an entropy-based confidence measurement. Specifically, the auxiliary judge task filters out pseudo samples with wrong predictions, while the entropy-based confidence measurement selects pseudo samples of high quality. In addition, to make full use of all pseudo samples, we propose a cumulative function based on the idea of curriculum learning that prompts the model to learn from easy samples before hard ones. Low-quality samples are filtered out by the double selectors, which is more conducive to training the student model. Experimental results on five NER benchmark datasets in different languages demonstrate the effectiveness of the proposed framework over several advanced baselines.
ISSN: 2329-9290
2329-9304
DOI: 10.1109/TASLP.2023.3250828
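
The entropy-based confidence measurement described in the summary can be illustrated with a minimal sketch: each pseudo-labeled sentence is scored by the average entropy of the teacher model's per-token label distributions, and only low-entropy (high-confidence) sentences are kept for student training. The threshold value, array shapes, and function names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sentence_entropy(token_probs: np.ndarray) -> float:
    """Average per-token entropy of one pseudo-labeled sentence.

    token_probs: array of shape (seq_len, num_labels) holding the
    teacher model's label distribution for each token.
    """
    eps = 1e-12  # avoid log(0)
    token_entropies = -np.sum(token_probs * np.log(token_probs + eps), axis=-1)
    return float(token_entropies.mean())

def select_confident_samples(batch_probs, threshold=0.3):
    """Keep pseudo samples whose average token entropy falls below the threshold.

    batch_probs: list of (seq_len, num_labels) arrays, one per sentence.
    threshold: illustrative cutoff only; the paper defines its own selection
    criterion and combines it with the auxiliary judge task.
    Returns the indices of the retained (high-confidence) samples.
    """
    return [i for i, probs in enumerate(batch_probs)
            if sentence_entropy(probs) < threshold]

# Example: a confident and an uncertain pseudo-labeled sentence (3 labels each).
confident = np.array([[0.97, 0.02, 0.01],
                      [0.95, 0.03, 0.02]])   # sharp distributions -> low entropy
uncertain = np.array([[0.40, 0.35, 0.25],
                      [0.34, 0.33, 0.33]])   # near-uniform -> high entropy
print(select_confident_samples([confident, uncertain]))  # [0]
```

In the full framework this filter would act only on the pseudo samples that already passed the auxiliary judge task, and the curriculum-style cumulative function would gradually admit harder (higher-entropy) samples over training rounds.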