Linear demixed domain multichannel nonnegative matrix factorization for speech enhancement
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Summary: | In this paper, we investigate blind source separation for audio signals based on multichannel nonnegative matrix factorization (MNMF) of magnitude spectrograms in a linear demixed domain. The original magnitude MNMF by itself is less effective in general acoustic situations because it discards the mutual information between input channels, which is represented by the off-diagonal complex elements of their spatial covariance matrices. To deal with this problem, several linear transformations of the multichannel input have been proposed that diagonalize the covariance matrices without losing the mutual information. However, when the number of microphones is small, it is difficult for static transformations to work well across various combinations of source positions. To address this, we first prove that general linear transformations (linear demixing) can be applied as preprocessing for the magnitude MNMF. We then confirm, in an experimental comparison on 2- and 4-channel noisy speech enhancement tasks, that a transformation adaptive to source positions, such as one obtained by frequency-domain independent component analysis, performs better than the conventional static transformation. |
---|---|
ISSN: | 2379-190X |
DOI: | 10.1109/ICASSP.2017.7952201 |
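
The preprocessing idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, array shapes, and synthetic data are assumptions made for illustration, and a per-frequency eigendecomposition of the sample spatial covariance stands in for the frequency-domain ICA named in the summary. The sketch applies a demixing matrix to each frequency bin of a multichannel STFT, checks that the spatial covariance becomes diagonal in the demixed domain, and takes the magnitudes that a magnitude MNMF would then factorize.

```python
import numpy as np


def pca_demixing(X):
    """Hypothetical helper: one demixing matrix per frequency bin.

    X has shape (F, T, M): frequency bins, time frames, channels (complex STFT).
    Returns W with shape (F, M, M). This is an eigenvector-based stand-in,
    not the frequency-domain ICA mentioned in the summary.
    """
    T = X.shape[1]
    # Sample spatial covariance per bin: R[f] = (1/T) * sum_t x x^H
    R = np.einsum("fti,ftj->fij", X, X.conj()) / T
    # eigh handles stacked Hermitian matrices; columns of U are eigenvectors
    _, U = np.linalg.eigh(R)
    # W = U^H, so that W R W^H is diagonal
    return U.conj().transpose(0, 2, 1)


def demix(X, W):
    """Apply y[f, t] = W[f] @ x[f, t] for every bin and frame."""
    return np.einsum("fij,ftj->fti", W, X)


# Synthetic data (assumption): random sources mixed by random complex matrices.
rng = np.random.default_rng(0)
F, T, M = 4, 200, 2
S = rng.standard_normal((F, T, M)) + 1j * rng.standard_normal((F, T, M))
A = rng.standard_normal((F, M, M)) + 1j * rng.standard_normal((F, M, M))
X = np.einsum("fij,ftj->fti", A, S)

W = pca_demixing(X)
Y = demix(X, W)

# Nonnegative input that a magnitude MNMF would factorize
magnitude = np.abs(Y)

# Covariance in the demixed domain; its off-diagonal part should vanish
R_y = np.einsum("fti,ftj->fij", Y, Y.conj()) / T
off_diag = R_y * (1 - np.eye(M))
max_off_diag = np.max(np.abs(off_diag))
```

A transformation computed from the observed signal, as here, changes with the source positions, whereas a static transformation is fixed in advance; the summary reports that the adaptive kind worked better in the 2- and 4-channel experiments.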