Stable Contrastive Learning for Self-Supervised Sentence Embeddings With Pseudo-Siamese Mutual Learning
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022, Vol. 30, pp. 3046-3059
Main Authors:
Format: Article
Language: English
Subjects:
Summary: Learning semantic sentence embeddings benefits a variety of natural language processing tasks. Recently, methods that use the contrastive learning framework to fine-tune pre-trained language models have been proposed and have achieved strong performance on sentence embeddings. However, sentence embeddings easily "overfit" to the contrastive learning objective: as contrastive training proceeds, the gap between contrastive learning and the test tasks leads to unstable and even declining performance on those tasks. For this reason, existing methods rely on a labeled development set to frequently evaluate test-task performance and select the best checkpoints, which limits them when labeled data is unavailable or extremely scarce. To address this problem, we propose Pseudo-Siamese network Mutual Learning (PSML) for self-supervised sentence embeddings, which reduces the gap between contrastive learning and test tasks. PSML uses mutual learning as its basic framework and consists of a main encoder and an auxiliary encoder; between the two encoders, two mutual learning losses are constructed to share learning signals. The proposed framework and losses help the model optimize more stably and generalize better to test tasks such as semantic textual similarity. Extensive experiments on seven public semantic textual similarity datasets show that PSML outperforms previous unsupervised contrastive methods for sentence embeddings. Moreover, PSML yields a stable performance curve on test tasks during training and achieves comparable performance without frequent evaluation on a labeled development set.
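The record only describes PSML at a high level (a main and an auxiliary encoder trained with contrastive learning plus two mutual learning losses that share signals between them). The snippet below is a minimal sketch of one plausible reading of that setup, not the authors' implementation: the placeholder linear encoders, the InfoNCE-style contrastive loss, the KL-based form of the mutual losses, the temperatures, and the noise-based second view are all assumptions made for illustration.

```python
# Minimal sketch (assumed, not the paper's code) of a pseudo-Siamese mutual
# learning setup: main + auxiliary encoders, a contrastive loss per encoder,
# and two mutual learning terms that let the encoders share learning signals.
import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.05):
    """Contrastive loss: matching rows of z1/z2 are positives, other rows negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature          # (batch, batch) similarity matrix
    labels = torch.arange(z1.size(0))           # i-th sentence matches its own second view
    return F.cross_entropy(logits, labels)

def mutual_kl(z_student, z_teacher, temperature=0.05):
    """One assumed form of a mutual learning loss: align the two encoders'
    in-batch similarity distributions (KL divergence, teacher side detached)."""
    sim_s = F.normalize(z_student, dim=-1) @ F.normalize(z_student, dim=-1).t()
    sim_t = F.normalize(z_teacher, dim=-1) @ F.normalize(z_teacher, dim=-1).t()
    return F.kl_div(
        F.log_softmax(sim_s / temperature, dim=-1),
        F.softmax(sim_t / temperature, dim=-1),
        reduction="batchmean",
    )

# Placeholder encoders; the paper fine-tunes pre-trained language models instead.
main_enc = nn.Sequential(nn.Linear(768, 768), nn.Tanh())
aux_enc = nn.Sequential(nn.Linear(768, 768), nn.Tanh())
opt = torch.optim.AdamW(
    list(main_enc.parameters()) + list(aux_enc.parameters()), lr=3e-5
)

x = torch.randn(32, 768)                    # stand-in for sentence features of one batch
x_view = x + 0.01 * torch.randn_like(x)     # second "view" (e.g., dropout-style augmentation)

z_m, z_m2 = main_enc(x), main_enc(x_view)   # main encoder embeddings for both views
z_a, z_a2 = aux_enc(x), aux_enc(x_view)     # auxiliary encoder embeddings for both views

loss = (
    info_nce(z_m, z_m2) + info_nce(z_a, z_a2)                       # contrastive objectives
    + mutual_kl(z_m, z_a.detach()) + mutual_kl(z_a, z_m.detach())   # two mutual learning losses
)
opt.zero_grad()
loss.backward()
opt.step()
```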
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/TASLP.2022.3203209