A novel data-free continual learning method with contrastive reversion
Published in: International Journal of Machine Learning and Cybernetics, 2024-02, Vol. 15 (2), p. 505-518
Main Authors:
Format: Article
Language: English
Subjects:
Summary: While continual learning has shown impressive performance in addressing catastrophic forgetting in traditional neural networks and enabling them to learn multiple tasks continuously, it still requires a large amount of input data to train neural networks to satisfactory classification performance. Since collecting a large amount of training data is a time-consuming and expensive procedure, this study proposes a novel data-free contrastive reversion method for continual learning (DFCRCL) that significantly reduces the amount of training data required for continual learning while maintaining or even improving classification performance. To achieve this goal, DFCRCL uses contrastive reversion to generate high-semantic pseudo samples from the previous task to guide the training of the current task. DFCRCL has three merits: (1) knowledge distillation from the previous task model to the current task model both reduces the required training data and avoids catastrophic forgetting, so DFCRCL can effectively learn a sequence of tasks continuously; (2) contrastive reversion enhances the semantic diversity of pseudo samples by learning the distinguishability between distinct pseudo samples in the feature space; (3) contrastive reversion improves the performance of knowledge distillation in DFCRCL by enhancing the semantic diversity of the pseudo samples generated from the previous task model. Compared to six mainstream continual learning methods, the proposed DFCRCL achieves comparable or better classification performance and stability in four benchmark continual learning scenarios. In addition, the effectiveness of DFCRCL is demonstrated by ablation experiments.
ISSN: 1868-8071, 1868-808X
DOI: 10.1007/s13042-023-01922-6
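
The abstract describes the mechanism only at a high level. The sketch below is a hypothetical PyTorch illustration of the general idea it names: invert a frozen previous-task model to synthesize pseudo samples, encourage diversity among those samples with a contrastive term, and distill the previous model's responses into the current-task model. Function names, the use of logits instead of intermediate features, and the loss weights are assumptions for illustration, not the authors' DFCRCL implementation.

```python
# Hypothetical sketch of data-free pseudo-sample synthesis with a contrastive
# diversity term, plus a standard knowledge-distillation loss. Not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def synthesize_pseudo_samples(teacher: nn.Module, num_samples=64, num_classes=10,
                              image_shape=(3, 32, 32), steps=200, lr=0.05,
                              temperature=0.1, contrast_weight=1.0):
    """Optimize random noise so the frozen teacher assigns it confident labels,
    while a contrastive term keeps the synthesized batch semantically diverse."""
    teacher.eval()
    for p in teacher.parameters():
        p.requires_grad_(False)  # only the pseudo samples are optimized

    x = torch.randn(num_samples, *image_shape, requires_grad=True)
    targets = torch.randint(0, num_classes, (num_samples,))
    opt = torch.optim.Adam([x], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        logits = teacher(x)
        ce = F.cross_entropy(logits, targets)  # make samples class-consistent

        # Contrastive diversity: penalize high similarity between different
        # pseudo samples. The paper operates in feature space; normalized
        # logits are used here only to keep the sketch self-contained.
        feats = F.normalize(logits, dim=1)
        sim = feats @ feats.t() / temperature
        mask = torch.eye(num_samples, dtype=torch.bool)
        sim = sim.masked_fill(mask, float('-inf'))
        contrast = torch.logsumexp(sim, dim=1).mean()

        (ce + contrast_weight * contrast).backward()
        opt.step()

    return x.detach(), targets


def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Standard KD loss: match softened teacher and student output distributions."""
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction='batchmean') * T * T
```

A plausible usage, under the same assumptions: synthesize a replay batch from the frozen previous-task model before each new task, then train the current-task model on a sum of the cross-entropy loss over the new task's real data and `distillation_loss` evaluated on the pseudo samples, so no stored exemplars from earlier tasks are needed.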