Linear demixed domain multichannel nonnegative matrix factorization for speech enhancement

Bibliographic Details
Main Authors: Taniguchi, Toru, Masuda, Taro
Format: Conference Proceeding
Language: English
Description
Summary: In this paper, we investigate blind source separation for audio signals based on multichannel nonnegative matrix factorization (MNMF) of magnitude spectrograms in a linear demixed domain. The original magnitude MNMF by itself is less effective in general acoustic situations because it discards the mutual information between input channels, which is represented by the off-diagonal complex elements of their spatial covariance matrices. To deal with this problem, several linear transformations of the multichannel input have been proposed that diagonalize the covariance matrices without losing the mutual information. However, when the number of microphones is small, it is difficult for static transformations to work well across various combinations of source positions. To address this, we first prove that general linear transformations (linear demixing) can be applied as preprocessing for the magnitude MNMF. We then confirm, in experimental comparisons on 2- and 4-channel noisy speech enhancement tasks, that a transformation adaptive to source positions, such as one obtained with frequency-domain independent component analysis, performs better than the conventional static transformation.
ISSN: 2379-190X
DOI: 10.1109/ICASSP.2017.7952201
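
The preprocessing idea in the summary can be illustrated with a small NumPy sketch. It is not the authors' method: the toy mixing matrix, the single frequency bin, and the helper names `spatial_covariance` and `offdiag_ratio` are all assumptions made for illustration, and an eigendecomposition of the covariance (simple decorrelation) stands in for the frequency-domain ICA mentioned in the summary. The sketch only shows that a data-adaptive linear transformation can move the inter-channel information out of the off-diagonal covariance terms before magnitudes are taken.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data for ONE frequency bin: 2 sources, 2 microphones, T frames.
# All values here are hypothetical, chosen only for illustration.
T = 2000
S = rng.standard_normal((2, T)) + 1j * rng.standard_normal((2, T))
S[1] *= 0.5  # make the second source weaker than the first
A = np.array([[1.0, 0.6 * np.exp(1j * 0.8)],
              [0.7 * np.exp(-1j * 1.1), 1.0]])  # assumed mixing matrix
X = A @ S  # observed multichannel coefficients, shape (channels, frames)


def spatial_covariance(Y):
    """Time-averaged spatial covariance matrix of shape (channels, channels)."""
    return (Y @ Y.conj().T) / Y.shape[1]


def offdiag_ratio(R):
    """Sum of off-diagonal magnitudes relative to the diagonal magnitudes."""
    off = np.abs(R - np.diag(np.diag(R))).sum()
    return off / np.abs(np.diag(R)).sum()


R_x = spatial_covariance(X)

# Data-adaptive linear transformation W built from the eigenvectors of R_x.
# This is plain decorrelation, used here as a stand-in for ICA-based demixing.
_, V = np.linalg.eigh(R_x)
W = V.conj().T
Y = W @ X  # signals in the linearly transformed domain
R_y = spatial_covariance(Y)

# Nonnegative magnitudes that a magnitude-domain factorization would take as input.
M = np.abs(Y)

print("off-diagonal ratio before transform:", offdiag_ratio(R_x))
print("off-diagonal ratio after transform: ", offdiag_ratio(R_y))
```

In this toy setting the off-diagonal ratio is large for the raw channels and numerically zero after the transformation, so taking magnitudes of `Y` discards less inter-channel information than taking magnitudes of `X`. Decorrelation only diagonalizes the time-averaged covariance; an ICA-based transformation, as referred to in the summary, imposes a stronger independence criterion.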