Self-Training With Double Selectors for Low-Resource Named Entity Recognition
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023-01, Vol. 31, p. 1-11
Main Authors:
Format: Article
Language: English
Summary: Named Entity Recognition (NER) is fundamental to many downstream natural language processing (NLP) tasks, but most advanced NER methods rely heavily on large amounts of labeled data that are costly to obtain. In this paper, we explore the effectiveness of self-training, a semi-supervised approach that reduces the reliance on manual annotation, for low-resource NER. However, the random pseudo-sample selection of the standard self-training framework can cause serious error propagation, especially in token-level tasks. To this end, this paper focuses on pseudo-sample selection and proposes a new self-training framework with double selectors: an auxiliary judge task and an entropy-based confidence measurement. Specifically, the auxiliary judge task filters out pseudo samples with wrong predictions, while the entropy-based confidence measurement selects high-quality pseudo samples. In addition, to make full use of all pseudo samples, we propose a cumulative function, based on the idea of curriculum learning, that prompts the model to learn from easy samples before hard ones. Low-quality samples are filtered out by the double selectors, which benefits the training of the student model. Experimental results on five NER benchmark datasets in different languages indicate the effectiveness of the proposed framework over several advanced baselines.
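The record gives no implementation details, but the entropy-based confidence selector described in the summary can be sketched roughly as below. This is a minimal illustration under assumptions of my own, not the paper's method: the sentence score is taken as the mean token-level entropy, the threshold is an arbitrary constant, and the names `token_entropy` and `select_pseudo_samples` are hypothetical. The paper's exact confidence measurement, auxiliary judge task, and cumulative curriculum function may differ.

```python
# Hypothetical sketch of an entropy-based pseudo-sample selector for
# token-level self-training. Teacher label distributions are scored by
# entropy; sentences whose mean token entropy falls below a threshold are
# kept, sorted from most to least confident.
import numpy as np

def token_entropy(probs: np.ndarray) -> np.ndarray:
    """Entropy of each token's predicted label distribution.

    probs: array of shape (num_tokens, num_labels); rows sum to 1.
    """
    eps = 1e-12
    return -np.sum(probs * np.log(probs + eps), axis=-1)

def select_pseudo_samples(batch_probs, threshold=1.0):
    """Keep sentences whose mean token entropy is below `threshold`.

    batch_probs: list of (num_tokens, num_labels) arrays, one per sentence.
    Returns sentence indices ordered from lowest to highest entropy, so a
    curriculum-style (easy-to-hard) schedule can consume them in order.
    """
    scores = np.array([token_entropy(p).mean() for p in batch_probs])
    keep = np.where(scores < threshold)[0]
    return keep[np.argsort(scores[keep])]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Fake teacher outputs: 3 sentences, 5 tokens each, 9 BIO-style labels.
    fake = [rng.dirichlet(np.full(9, a), size=5) for a in (0.1, 0.5, 2.0)]
    print(select_pseudo_samples(fake, threshold=1.0))
```

Sorting the retained sentences by increasing entropy mirrors the easy-to-hard ordering that the curriculum-inspired cumulative function is described as encouraging; how the paper actually combines the two selectors with that schedule is not specified in this record.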
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/TASLP.2023.3250828