Self-supervised Regularization for Text Classification
Published in: Transactions of the Association for Computational Linguistics, 2021-01, Vol. 9, p. 641-656
Main Authors: , ,
Format: Article
Language: English
Summary: Text classification is a widely studied problem and has broad applications. In many real-world problems, the number of texts for training classification models is limited, which renders these models prone to overfitting. To address this problem, we propose SSL-Reg, a data-dependent regularization approach based on self-supervised learning (SSL). SSL (Devlin et al.) is an unsupervised learning approach that defines auxiliary tasks on input data without using any human-provided labels and learns data representations by solving these auxiliary tasks. In SSL-Reg, a supervised classification task and an unsupervised SSL task are performed simultaneously. The SSL task is defined purely on input texts without using any human-provided labels. Training a model with an SSL task can prevent the model from overfitting to the limited number of class labels in the classification task. Experiments on 17 text classification datasets demonstrate the effectiveness of our proposed method. Code is available at .
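The summary describes training on a supervised classification loss and a self-supervised loss simultaneously. A minimal NumPy sketch of such a combined objective is below; the weight name `lam` and the choice of masked-token prediction as the SSL task are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def cross_entropy(logits, target):
    """Cross-entropy for one example: -log softmax(logits)[target]."""
    z = logits - logits.max()                 # stabilize the softmax
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]

def ssl_reg_loss(cls_logits, cls_label, mlm_logits, masked_ids, lam=0.1):
    """Combined objective (sketch): supervised classification loss plus a
    lam-weighted self-supervised loss, here the average masked-token
    prediction loss over the masked positions. `lam` is an assumed
    trade-off hyperparameter, not taken from the paper."""
    l_cls = cross_entropy(cls_logits, cls_label)
    l_ssl = np.mean([cross_entropy(mlm_logits[i], tok)
                     for i, tok in enumerate(masked_ids)])
    return l_cls + lam * l_ssl
```

With `lam=0` the objective reduces to plain supervised training; increasing `lam` strengthens the regularization from the self-supervised task.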
ISSN: 2307-387X
DOI: 10.1162/tacl_a_00389