Dual stage probabilistic voice activity detector
Published in: | The Journal of the Acoustical Society of America, 2010-03, Vol. 127 (3_Supplement), p. 1816 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | Voice activity detectors (VADs) are a critical part of every speech enhancement and speech processing system. One of the major problems in practical realizations is achieving robust VAD in the presence of background noise. Most statistical model-based approaches assume a Gaussian distribution in the discrete Fourier transform domain, an assumption that deviates from real observations. In this paper, we propose a class of VAD algorithms based on several statistical models of the probability density functions of the magnitudes. In addition, we evaluate several approaches for time-smoothing the magnitude response to achieve a more robust estimate. A large data corpus of in-car noise conditions is then used to optimize the parameters of the VAD, and the results are discussed. |
---|---|
ISSN: | 0001-4966, 1520-8524 |
DOI: | 10.1121/1.3384189 |
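
The summary mentions time-smoothing the magnitude response before making a detection decision, but it does not say which smoothing schemes or statistical models were evaluated. The following is only a minimal sketch under stated assumptions: it uses first-order recursive (exponential) smoothing and a plain energy threshold against a known noise floor, which is a stand-in and not the statistical model-based detector the record describes. The function names and the parameters `alpha`, `noise_floor`, and `threshold_db` are hypothetical and chosen for illustration.

```python
def smooth_magnitudes(frames, alpha=0.7):
    """Recursively smooth per-bin magnitudes across frames.

    frames: list of frames, each a list of spectral magnitudes.
    alpha:  hypothetical smoothing factor in [0, 1); larger means heavier smoothing.
    """
    smoothed = []
    prev = None
    for frame in frames:
        if prev is None:
            # The first frame has no history, so it is passed through unchanged.
            cur = list(frame)
        else:
            # y[t] = alpha * y[t-1] + (1 - alpha) * x[t], applied per frequency bin.
            cur = [alpha * p + (1.0 - alpha) * m for p, m in zip(prev, frame)]
        smoothed.append(cur)
        prev = cur
    return smoothed


def detect_speech(frames, noise_floor, threshold_db=6.0, alpha=0.7):
    """Flag frames whose mean smoothed magnitude exceeds the noise floor.

    Assumption: a simple energy-style test replaces the likelihood-based
    decision that a statistical model-based VAD would use.
    """
    # Convert the dB margin to a linear magnitude ratio.
    ratio = 10 ** (threshold_db / 20.0)
    decisions = []
    for frame in smooth_magnitudes(frames, alpha):
        mean_mag = sum(frame) / len(frame)
        decisions.append(mean_mag > noise_floor * ratio)
    return decisions
```

For example, `detect_speech([[1.0, 1.0], [1.0, 1.0], [10.0, 10.0]], noise_floor=1.0)` marks only the last frame as active. Heavier smoothing (a larger `alpha`) reduces spurious triggers on short noise bursts at the cost of a slower response to speech onsets.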