Reliable voice activity detection algorithms under adverse environments

Bibliographic Details
Main Authors: Stadtschnitzer, M., Van Pham, T., Tang Tan Chien
Format: Conference Proceeding
Language: English
Description
Summary: This paper proposes two robust voice activity detection (VAD) algorithms for harsh environments. The first algorithm is based on a supervised neural network (NN) trained using the Levenberg-Marquardt algorithm. A feedforward NN with two layers operates on input features, namely the mel-frequency cepstral coefficients extracted from noisy speech frames. The second algorithm is a threshold-based method that employs only a single feature, the subband power distance, calculated from wavelet coefficients at different wavelet subbands. A statistical percentile filtering technique based on long-term information is improved to estimate the adaptive noise threshold more accurately. The proposed algorithms are tested on the TIMIT database, artificially distorted by different additive noise types, and are compared with state-of-the-art VAD methods. The results show that they are very robust to different types of noise and mostly outperform standard VADs such as ETSI AFE ES 202 050 and ITU-T G.729 B.
DOI: 10.1109/CCE.2008.4578961
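
Illustration: The summary describes a threshold-based VAD whose noise threshold is estimated by percentile filtering over long-term information. The Python sketch below shows only that general idea and is not the authors' algorithm. It substitutes plain frame log power for the wavelet subband power distance feature. The function names, the window length, the percentile, and the decision margin are all assumptions chosen for the example.

```python
import math
import random


def frame_log_power(signal, frame_len):
    """Log power (dB) of consecutive non-overlapping frames."""
    powers = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        mean_sq = sum(x * x for x in frame) / frame_len
        powers.append(10.0 * math.log10(mean_sq + 1e-12))
    return powers


def percentile(values, q):
    """Nearest-rank style percentile of a non-empty list, q in [0, 1]."""
    ordered = sorted(values)
    index = int(round(q * (len(ordered) - 1)))
    return ordered[index]


def percentile_vad(log_powers, window=50, q=0.2, margin_db=6.0):
    """Flag a frame as speech when it exceeds an adaptive threshold.

    The threshold is a low percentile of the last `window` frames (a rough
    noise-floor estimate) plus a fixed margin. All three parameter values
    are illustrative assumptions, not values from the paper.
    """
    decisions = []
    for i, p in enumerate(log_powers):
        history = log_powers[max(0, i - window + 1):i + 1]
        noise_floor = percentile(history, q)
        decisions.append(p > noise_floor + margin_db)
    return decisions


def demo():
    """Synthetic test signal: quiet noise, a loud burst, quiet noise."""
    rng = random.Random(0)
    noise_a = [rng.gauss(0.0, 0.01) for _ in range(8000)]
    burst = [rng.gauss(0.0, 0.5) for _ in range(1600)]
    noise_b = [rng.gauss(0.0, 0.01) for _ in range(8000)]
    return percentile_vad(frame_log_power(noise_a + burst + noise_b, 160))
```

Calling `demo()` returns one boolean per 160-sample frame. A low percentile is used so that short bursts of activity inside the window do not raise the noise-floor estimate, which is the usual motivation for percentile filtering over a simple running mean.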