Document-level denoising relation extraction with false-negative mining and reinforced positive-class knowledge distillation


Bibliographic Details
Published in: Information Processing & Management, 2024-01, Vol. 61 (1), p. 103533, Article 103533
Main Authors: Zeng, Daojian, Zhu, Jianling, Chen, Hongting, Dai, Jianhua, Jiang, Lincheng
Format: Article
Language:English
Summary: Many datasets for document-level relation extraction (RE) suffer from incomplete labeling, particularly the false-negative problem, which induces improper biases during training. However, existing denoising methods are either limited by dataset scale or focus primarily on sentence-level RE. To tackle this prevalent issue of false negatives, we propose a denoising framework called FM-RKD for document-level RE. First, a false-negative mining mechanism is introduced to identify and re-annotate false-negative samples (FNs) within the original corpus, thereby producing a higher-quality pseudo corpus. Then, we propose a reinforced positive-class knowledge distillation method, in which a teacher network trained with positive samples provides soft labels for a student network. This approach enables the student network to learn complete positive-class patterns and mitigates the overfitting caused by FNs. Extensive experiments on the Re-DocRED dataset show that FM-RKD outperforms the current state-of-the-art method by 1.36% in F1 score and 1.24% in Ign F1 score when the training data is incompletely annotated. Moreover, FM-RKD consistently achieves new peak performance with an F1 score of 78.38% even when the training data is well annotated.
• Incomplete labeling induces improper biases during training.
• A false-negative mining mechanism creates a high-quality pseudo corpus.
• Knowledge distillation captures complete positive-class patterns.
• The teacher network is innovatively trained using only positive samples.
• The proposed denoising framework prevents overfitting to false-negative samples.
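The record does not include the paper's exact loss formulation, but the soft-label transfer it describes follows the standard knowledge-distillation pattern: the student is trained to match the teacher's temperature-softened class distribution. A minimal, generic sketch (function names and the temperature value are illustrative assumptions, not taken from the paper):

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature > 1 softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    This is the generic soft-label objective used in knowledge distillation;
    the teacher's probabilities serve as the training targets for the student.
    """
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(ti * math.log(ti / si) for ti, si in zip(t, s))
```

When the student reproduces the teacher's logits exactly, the loss is zero; any divergence from the teacher's soft labels yields a positive penalty, which is what lets positive-class knowledge from the teacher regularize a student trained on noisy negatives.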
ISSN: 0306-4573, 1873-5371
DOI: 10.1016/j.ipm.2023.103533