Maximum negentropy beamforming using complex generalized Gaussian distribution model

Bibliographic Details
Main Authors: Kumatani, K, Rauch, B, McDonough, J, Klakow, D
Format: Conference Proceeding
Language: English
Description
Summary: This paper presents a new beamforming method for distant speech recognition. In contrast to conventional beamforming techniques, our beamformer adjusts the active weight vectors so as to make the distribution of the beamformer's outputs as super-Gaussian as possible. This is achieved by maximizing the negentropy of the outputs. In our previous work, the generalized Gaussian probability density function (GG-PDF) for real-valued random variables (RVs) was used to model the magnitude of a speech signal, so the subband components themselves were not modeled directly. Accordingly, it could not represent the distribution of the subband signal faithfully. In this work, we use the GG-PDF for complex RVs in order to model subband components directly. The appropriate amount of data for adapting the active weight vector is also studied. The performance of the beamforming techniques is investigated through a series of automatic speech recognition experiments on the Multi-Channel Wall Street Journal Audio Visual Corpus (MC-WSJ-AV). The data was recorded with real sensors in a real meeting room, and hence contains noise from computers, fans, and other apparatus in the room. The test data is neither artificially convolved with measured impulse responses nor unrealistically mixed with separately recorded noise.
ISSN: 1058-6393, 2576-2303
DOI: 10.1109/ACSSC.2010.5757769
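
The negentropy criterion named in the summary can be illustrated with a minimal sketch. This record does not reproduce the paper's formulas, so everything below is an assumption: the density is one common parameterization of a zero-mean circular complex generalized Gaussian with shape `f`, normalized so that E|y|^2 equals `sigma2`; the scale is set by moment matching; the shape is fixed by hand rather than trained; and the function names `cgg_log_pdf` and `empirical_negentropy` are hypothetical. It only evaluates the criterion on given samples and does not adapt any beamformer weights.

```python
import math
import numpy as np


def cgg_log_pdf(y, sigma2, f):
    """Log-density of a zero-mean circular complex generalized Gaussian.

    Assumed form: p(y) = f / (2*pi*alpha^2*Gamma(2/f)) * exp(-(|y|/alpha)^f),
    with alpha^2 = sigma2 * Gamma(2/f) / Gamma(4/f) so that E|y|^2 = sigma2.
    f = 2 gives the circular complex Gaussian; f < 2 is super-Gaussian.
    """
    lg2 = math.lgamma(2.0 / f)
    lg4 = math.lgamma(4.0 / f)
    alpha2 = sigma2 * math.exp(lg2 - lg4)
    log_norm = math.log(f) - math.log(2.0 * math.pi) - math.log(alpha2) - lg2
    return log_norm - (np.abs(y) ** 2 / alpha2) ** (f / 2.0)


def empirical_negentropy(y, f):
    """Entropy of a variance-matched Gaussian minus the model-based entropy estimate.

    The scale is moment-matched to the samples (an assumption of this sketch).
    """
    y = np.asarray(y, dtype=complex)
    sigma2 = float(np.mean(np.abs(y) ** 2))
    h_gauss = math.log(math.pi * math.e * sigma2)  # circular complex Gaussian entropy
    h_model = -float(np.mean(cgg_log_pdf(y, sigma2, f)))
    return h_gauss - h_model


# Synthetic data, for illustration only (not speech, not from MC-WSJ-AV).
rng = np.random.default_rng(0)
n = 20000
gauss = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / math.sqrt(2.0)

# Super-Gaussian samples with shape f = 1: (|y|/alpha)^f follows Gamma(2/f, 1).
f = 1.0
radius = rng.gamma(shape=2.0 / f, scale=1.0, size=n) ** (1.0 / f)
phase = rng.uniform(0.0, 2.0 * math.pi, size=n)
super_gauss = radius * np.exp(1j * phase)

print(empirical_negentropy(gauss, 2.0))      # Gaussian model on Gaussian data
print(empirical_negentropy(super_gauss, f))  # super-Gaussian model and data
```

With `f = 2` the model coincides with the variance-matched Gaussian, so the first value is zero up to rounding; the second should come out positive, which is the quantity a maximum negentropy beamformer would try to increase at its output.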