Stable Contrastive Learning for Self-Supervised Sentence Embeddings With Pseudo-Siamese Mutual Learning
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022, Vol. 30, pp. 3046-3059
Main Authors:
Format: Article
Language: English
Subjects:
Summary: Learning semantic sentence embeddings benefits a variety of natural language processing tasks. Recently, methods that use the contrastive learning framework to fine-tune pre-trained language models have been proposed and have achieved strong performance on sentence embeddings. However, sentence embeddings easily "overfit" to the contrastive learning objective: as contrastive training proceeds, the gap between contrastive learning and the test tasks leads to unstable and even declining performance on those tasks. For this reason, existing methods rely on a labeled development set to frequently evaluate test-task performance and select the best checkpoints, which limits them when labeled data is unavailable or extremely scarce. To address this problem, we propose Pseudo-Siamese network Mutual Learning (PSML) for self-supervised sentence embeddings, which reduces the gap between contrastive learning and test tasks. PSML uses mutual learning as its basic framework and consists of a main encoder and an auxiliary encoder; between the two encoders, two mutual learning losses are constructed to share learning signals. The proposed framework and losses help the model optimize more stably and generalize better to test tasks such as semantic textual similarity. Extensive experiments on seven public semantic textual similarity datasets show that PSML outperforms previous unsupervised contrastive methods for sentence embeddings. Moreover, PSML yields a stable performance curve on test tasks during training and achieves comparable performance without frequent evaluation on a labeled development set.
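The record only describes PSML at a high level (a main and an auxiliary encoder trained with contrastive learning plus two mutual learning losses that share signals between them). The snippet below is a minimal sketch of one plausible reading of that setup, not the authors' implementation: the placeholder linear encoders, the InfoNCE-style contrastive loss, the KL-based form of the mutual losses, the temperatures, and the noise-based second view are all assumptions made for illustration.

```python
# Minimal sketch (assumed, not the paper's code) of a pseudo-Siamese mutual
# learning setup: main + auxiliary encoders, a contrastive loss per encoder,
# and two mutual learning terms that let the encoders share learning signals.
import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.05):
    """Contrastive loss: matching rows of z1/z2 are positives, other rows negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature          # (batch, batch) similarity matrix
    labels = torch.arange(z1.size(0))           # i-th sentence matches its own second view
    return F.cross_entropy(logits, labels)

def mutual_kl(z_student, z_teacher, temperature=0.05):
    """One assumed form of a mutual learning loss: align the two encoders'
    in-batch similarity distributions (KL divergence, teacher side detached)."""
    sim_s = F.normalize(z_student, dim=-1) @ F.normalize(z_student, dim=-1).t()
    sim_t = F.normalize(z_teacher, dim=-1) @ F.normalize(z_teacher, dim=-1).t()
    return F.kl_div(
        F.log_softmax(sim_s / temperature, dim=-1),
        F.softmax(sim_t / temperature, dim=-1),
        reduction="batchmean",
    )

# Placeholder encoders; the paper fine-tunes pre-trained language models instead.
main_enc = nn.Sequential(nn.Linear(768, 768), nn.Tanh())
aux_enc = nn.Sequential(nn.Linear(768, 768), nn.Tanh())
opt = torch.optim.AdamW(
    list(main_enc.parameters()) + list(aux_enc.parameters()), lr=3e-5
)

x = torch.randn(32, 768)                    # stand-in for sentence features of one batch
x_view = x + 0.01 * torch.randn_like(x)     # second "view" (e.g., dropout-style augmentation)

z_m, z_m2 = main_enc(x), main_enc(x_view)   # main encoder embeddings for both views
z_a, z_a2 = aux_enc(x), aux_enc(x_view)     # auxiliary encoder embeddings for both views

loss = (
    info_nce(z_m, z_m2) + info_nce(z_a, z_a2)                       # contrastive objectives
    + mutual_kl(z_m, z_a.detach()) + mutual_kl(z_a, z_m.detach())   # two mutual learning losses
)
opt.zero_grad()
loss.backward()
opt.step()
```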
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/TASLP.2022.3203209