Joint Ideal Ratio Mask and Generative Adversarial Networks for Monaural Speech Enhancement

Bibliographic Details
Main Authors:	Yuan, Jing; Bao, Changchun
Format:	Conference Proceeding
Language:	English
Subjects:	Conferences; Deep learning; Generative adversarial networks; Generators; Nonhomogeneous media; Signal processing; Speech enhancement

Description
Summary:	Speech enhancement is the task of improving certain perceptual aspects of noisy speech. Generative adversarial networks (GANs) have recently become a popular deep learning method, and several GAN structures have been proposed [1], [2]. In this paper, we propose a new GAN-based framework for speech enhancement. We train two models: a generative model G and a discriminative model D, both defined as feedforward multilayer perceptrons (MLPs) [3]. The two models differ in their design. The generator G is a deep neural network (DNN) based on the masking technique, in which the magnitude spectrum of the noise and the magnitude spectrum of the clean speech are estimated simultaneously from noisy speech features. The discriminator D uses the MLP structure to predict the clean speech magnitude spectrum directly, and it discriminates between data that comes from clean speech and data generated by the G network. The G network is then used to perform the speech enhancement. Objective evaluation results show that the proposed framework improves significantly on the performance of a traditional DNN and of recent GAN-based speech enhancement methods.
ISSN:	2164-5221
DOI:	10.1109/ICSP.2018.8652276
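
The masking step mentioned in the summary can be illustrated with a minimal sketch. The summary does not give the mask formula, so the magnitude-domain ratio mask below (clean estimate divided by the sum of clean and noise estimates) is an assumption, chosen as one common form of ratio mask. The function names `ratio_mask` and `apply_mask` and the `eps` parameter are hypothetical and are not taken from the paper. The two estimates stand in for the outputs of a generator that predicts both magnitude spectra from noisy features.

```python
import numpy as np


def ratio_mask(clean_mag, noise_mag, eps=1e-8):
    """Magnitude-domain ratio mask with values in [0, 1].

    Assumed form (not taken from the paper): S / (S + N), where S and N
    are non-negative magnitude spectra of the same shape.
    eps guards against division by zero in silent bins.
    """
    clean_mag = np.asarray(clean_mag, dtype=float)
    noise_mag = np.asarray(noise_mag, dtype=float)
    return clean_mag / (clean_mag + noise_mag + eps)


def apply_mask(noisy_mag, clean_est, noise_est):
    """Scale each time-frequency bin of the noisy magnitude by the mask."""
    mask = ratio_mask(clean_est, noise_est)
    return mask * np.asarray(noisy_mag, dtype=float)


# Toy spectra with 2 frames and 2 frequency bins (made-up numbers).
clean_est = np.array([[3.0, 0.0], [1.0, 2.0]])
noise_est = np.array([[1.0, 2.0], [1.0, 0.0]])
noisy_mag = np.array([[4.0, 2.0], [2.0, 2.0]])

mask = ratio_mask(clean_est, noise_est)
enhanced = apply_mask(noisy_mag, clean_est, noise_est)
```

Bins dominated by speech get a mask value near 1 and pass through almost unchanged, while bins dominated by noise are attenuated toward 0. In a full system the enhanced magnitude would typically be combined with the noisy phase to resynthesize a waveform, a step this sketch leaves out.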