Transductive nonnegative matrix factorization for semi-supervised high-performance speech separation

Bibliographic Details
Main Authors:	Naiyang Guan, Long Lan, Dacheng Tao, Zhigang Luo, Xuejun Yang
Format:	Conference Proceeding
Language:	English
Subjects:	Dictionaries; Nonnegative matrix factorization; Silicon; Spectrogram; Speech; Speech processing; speech separation; Time-domain analysis; Training; transductive learning

Description
Summary:	Because the magnitude spectrogram of a speech signal is non-negative, nonnegative matrix factorization (NMF) has achieved promising performance in speech separation by independently learning a dictionary on the speech signals of each known speaker. However, traditional NMF fails to represent the mixture signals accurately because the speaker dictionaries are learned in the absence of the mixture signals. In this paper, we propose a new transductive NMF algorithm (TNMF) that jointly learns a dictionary on both the speech signals of each speaker and the mixture signals to be separated. Because TNMF encodes the mixture signals, it learns a more descriptive dictionary than NMF and significantly boosts separation performance. Experimental results on the popular TIMIT dataset show that the proposed TNMF-based methods outperform traditional NMF-based methods for separating monophonic mixtures of speech signals of known speakers.
ISSN:	1520-6149 2379-190X
DOI:	10.1109/ICASSP.2014.6854057
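
Illustrative sketch:	The contrast drawn in the summary can be illustrated with a small NumPy example. The `nmf` and `separate` functions follow the standard supervised NMF recipe (multiplicative updates for the Frobenius objective, then a soft mask on the mixture). The `joint_nmf` function is only an assumption about what "jointly learning" could look like: it factorizes the training and mixture spectrograms together, with structural zeros tying each speaker's training frames to that speaker's atoms. It is not the TNMF algorithm from the paper, whose objective and update rules are not given in this record. All function names, the rank, and the iteration counts are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, W=None, update_W=True, eps=1e-9, seed=0):
    """Approximate nonnegative V (freq x time) as W @ H using
    multiplicative updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    if W is None:
        W = rng.random((V.shape[0], rank)) + eps
    else:
        W = W.copy()  # never modify the caller's dictionary in place
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate(V_mix, W1, W2, n_iter=200, eps=1e-9):
    """Baseline: keep the per-speaker dictionaries fixed, fit activations
    on the mixture, then split the mixture with a soft mask."""
    W = np.hstack([W1, W2])
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W, update_W=False)
    k = W1.shape[1]
    S1, S2 = W1 @ H[:k], W2 @ H[k:]
    mask1 = S1 / (S1 + S2 + eps)
    return mask1 * V_mix, (1.0 - mask1) * V_mix

def joint_nmf(V1, V2, V_mix, rank, n_iter=200, eps=1e-9, seed=0):
    """ASSUMPTION, not the paper's TNMF: factorize [V1 | V2 | V_mix]
    with one dictionary [W1 | W2], so the mixture frames also shape
    the learned atoms."""
    rng = np.random.default_rng(seed)
    n1, n2 = V1.shape[1], V2.shape[1]
    V = np.hstack([V1, V2, V_mix])
    W = rng.random((V.shape[0], 2 * rank)) + eps
    H = rng.random((2 * rank, V.shape[1])) + eps
    # Structural zeros: speaker-1 frames use only speaker-1 atoms and
    # speaker-2 frames only speaker-2 atoms; mixture frames use both.
    H[rank:, :n1] = 0.0
    H[:rank, n1:n1 + n2] = 0.0
    for _ in range(n_iter):
        # Multiplicative updates keep zero entries at zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W[:, :rank], W[:, rank:], H[:, n1 + n2:]
```

In the baseline, `W1` and `W2` come from separate `nmf` calls on each speaker's training spectrogram and never see the mixture. In the joint variant, the mixture columns contribute to the dictionary update, which is the general idea the summary attributes to transductive learning.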