Reducing the Length Divergence Bias for Textual Matching Models via Alternating Adversarial Training

Bibliographic Details
Main Authors: Zheng, Lantao, Kuang, Wenxin, Liang, Qizhuang, Liang, Wei, Hu, Qiao, Fu, Wei, Ding, Xiashu, Xu, Bijiang, Hu, Yupeng
Format: Conference Proceeding
Language: English
Online Access: Request full text
Description
Summary: Although deep learning has achieved remarkable results on natural language processing tasks, many researchers have recently shown that models reach high performance by exploiting statistical biases in their datasets. Once such models, trained on statistically biased datasets, are applied in scenarios where the bias is absent, their accuracy drops significantly. In this work, we focus on the length divergence bias, which makes language models tend to classify sample pairs with high length divergence as negative, and vice versa. We propose an approach that makes the model attend to semantics rather than to this bias. First, we construct an adversarial test set that magnifies the effect of the bias on models. Then, we introduce several novel techniques to demote the length divergence bias. Finally, we conduct experiments on two textual matching corpora; the results show that our approach effectively improves the generalization and robustness of the model, even though the two corpora exhibit different degrees of bias.
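The adversarial test set described in the summary can be sketched roughly as follows. The token-count divergence measure, the threshold, and all function names here are illustrative assumptions, not the authors' implementation: the idea is to keep only pairs whose gold label contradicts the length heuristic, so a model relying on length rather than semantics should score poorly on the subset.

```python
# Hypothetical sketch of an adversarial test set for length divergence bias.
# The divergence definition and threshold below are assumptions for illustration.

def length_divergence(text_a: str, text_b: str) -> int:
    """Absolute difference in whitespace-token count between the two texts."""
    return abs(len(text_a.split()) - len(text_b.split()))

def adversarial_split(pairs, threshold=3):
    """Keep only pairs whose label contradicts the length heuristic:
    positive (matching) pairs with large divergence, and negative
    (non-matching) pairs with small divergence."""
    adversarial = []
    for text_a, text_b, label in pairs:
        div = length_divergence(text_a, text_b)
        if (label == 1 and div >= threshold) or (label == 0 and div < threshold):
            adversarial.append((text_a, text_b, label))
    return adversarial

# Toy corpus: (text_a, text_b, label) with label 1 = semantic match.
pairs = [
    ("a short question",
     "a much much much longer paraphrase of the same short question", 1),
    ("the cat sat", "dogs bark loudly", 0),
]
subset = adversarial_split(pairs, threshold=3)
print(len(subset))  # → 2: both toy pairs contradict the length heuristic
```

A biased model's accuracy gap between the full test set and this subset then quantifies how much it leans on length divergence rather than semantics.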
ISSN:2693-8928
DOI:10.1109/CSCloud-EdgeCom58631.2023.00040