
Dual stage probabilistic voice activity detector


Bibliographic Details
Published in: The Journal of the Acoustical Society of America, 2010-03, Vol. 127 (3_Supplement), p. 1816
Main Authors: Tashev, Ivan; Lovitt, Andrew; Acero, Alex
Format: Article
Language: English
Description
Summary: Voice activity detectors (VADs) are a critical part of every speech enhancement and speech processing system. One of the major problems in practical realizations is achieving robust VAD in the presence of background noise. Most statistical model-based approaches employ the Gaussian assumption in the discrete Fourier transform domain, which deviates from real observations. In this paper, we propose a class of VAD algorithms based on several statistical models of the probability density functions of the magnitudes. In addition, we evaluate several approaches for time smoothing the magnitude response to achieve a more robust estimate. A large data corpus of in-car noise conditions is then used to optimize the parameters of the VAD, and the results are discussed.
ISSN: 0001-4966, 1520-8524
DOI: 10.1121/1.3384189
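
Illustrative note: the summary refers to the Gaussian assumption in the discrete Fourier transform domain and to time smoothing, but gives no formulas. The following is a minimal sketch of the conventional Gaussian likelihood-ratio baseline that the summary contrasts against; it is not the authors' proposed method. The function name `gaussian_llr_vad`, the smoothing factor `alpha`, the decision `threshold`, and the maximum-likelihood estimate of the a priori SNR are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_llr_vad(frames_power, noise_power, alpha=0.7, threshold=0.5):
    """Frame-level speech/non-speech decisions under a Gaussian model.

    frames_power: array (n_frames, n_bins) of squared DFT magnitudes.
    noise_power:  array (n_bins,) with an estimate of the noise variance per bin.
    alpha, threshold: hypothetical tuning parameters, not taken from the paper.
    """
    frames_power = np.asarray(frames_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    smoothed = frames_power[0].copy()
    decisions = []
    for power in frames_power:
        # Time smoothing: first-order recursive average of the power spectrum.
        smoothed = alpha * smoothed + (1.0 - alpha) * power
        # A posteriori SNR per frequency bin.
        gamma = smoothed / noise_power
        # Maximum-likelihood a priori SNR estimate (assumed here for simplicity).
        xi = np.maximum(gamma - 1.0, 0.0)
        # Per-bin log likelihood ratio for complex Gaussian speech and noise.
        llr = gamma * xi / (1.0 + xi) - np.log1p(xi)
        # Average over bins and compare with the threshold.
        decisions.append(bool(llr.mean() > threshold))
    return np.array(decisions)

# Synthetic usage: 5 noise-only frames followed by 5 frames at 10x the noise power.
power = np.ones((10, 4))
power[5:] = 10.0
print(gaussian_llr_vad(power, np.ones(4)))
```

In this sketch a larger `alpha` gives a steadier decision at the cost of slower reaction to speech onsets; the paper reports optimizing such parameters on in-car noise data, and those values are not reproduced here.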