Curricular Contrastive Regularization for Speech Enhancement with Self-Supervised Representations

Bibliographic Details
Main Authors: Xu, Xinmeng; Han, Chang; Zhang, Yiqun; Tu, Weiping; Yang, Yuhong
Format: Conference Proceeding
Language: English
Description
Summary: Existing deep learning-based speech enhancement (SE) methods adopt only clean speech as positive samples to guide the training of speech enhancement networks, while negative samples, i.e., noisy speech, are unexploited. In this paper, we adopt contrastive regularization (CR) built upon contrastive learning to exploit the information in both noisy and clean speech, as negative and positive samples, respectively. Specifically, CR minimizes the distance between clean and enhanced speech and maximizes the distance between noisy and enhanced speech in the representation space of a self-supervised learning model. However, the contrastive samples are non-consensual, as the negatives are usually represented distantly from the clean speech, leaving the solution space still under-constricted. To tackle this issue, we provide negative samples assembled from (1) the noisy speech and (2) the corresponding enhanced speech obtained without using CR, and we customize a curriculum learning strategy that defines the importance of these negative samples to balance the learning difficulty caused by the different similarities between the embeddings of the positive and negative samples. Experiments show that our proposal improves SE performance effectively without introducing additional computation or parameters.
ISSN: 2379-190X
DOI: 10.1109/ICASSP48485.2024.10445912
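
The summary describes CR as pulling the enhanced speech toward the clean speech and pushing it away from the negatives in an embedding space. The record does not give the loss formula, so the following is only a minimal sketch under stated assumptions: a ratio of mean absolute (L1) distances is assumed as the loss form, the arrays stand in for embeddings that would come from a self-supervised model, and the names `l1_distance`, `contrastive_regularization`, and the fixed `weights` (a stand-in for curriculum-defined importance) are hypothetical, not the authors' implementation.

```python
import numpy as np


def l1_distance(a, b):
    """Mean absolute difference between two embedding arrays."""
    return float(np.mean(np.abs(a - b)))


def contrastive_regularization(enhanced, clean, negatives, weights, eps=1e-8):
    """Assumed ratio-style loss (not taken from the paper).

    Small when `enhanced` is close to the positive (`clean`) and far from
    the weighted negatives; large in the opposite case.
    """
    positive_term = l1_distance(enhanced, clean)
    negative_term = sum(
        w * l1_distance(enhanced, neg) for w, neg in zip(weights, negatives)
    )
    return positive_term / (negative_term + eps)


# Stand-in embeddings (frames x features); real ones would come from an SSL model.
clean = np.zeros((4, 8))
noisy = np.ones((4, 8))                   # negative 1: noisy speech
baseline_enhanced = 0.5 * np.ones((4, 8)) # negative 2: enhanced speech without CR

negatives = [noisy, baseline_enhanced]
weights = [1.0, 0.5]  # hypothetical importance values; a curriculum would adapt these

loss_good = contrastive_regularization(0.1 * np.ones((4, 8)), clean, negatives, weights)
loss_bad = contrastive_regularization(0.9 * np.ones((4, 8)), clean, negatives, weights)
print(loss_good < loss_bad)
```

In this toy setup an output close to the clean embedding yields a lower loss than one close to the noisy embedding, which is the behavior the summary attributes to CR. Because such a term is applied only during training, it is consistent with the summary's claim that no extra computation or parameters are added to the enhancement network itself.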