
Dual stage probabilistic voice activity detector

Voice activity detectors (VADs) are a critical part of every speech enhancement and speech processing system. One of the major problems in practical realizations is to achieve robust VAD in conditions of background noise. Most of the statistical model-based approaches employ the Gaussian assumption in...

Bibliographic Details
Published in: The Journal of the Acoustical Society of America 2010-03, Vol.127 (3_Supplement), p.1816-1816
Main Authors: Tashev, Ivan; Lovitt, Andrew; Acero, Alex
Format: Article
Language: English
container_end_page 1816
container_issue 3_Supplement
container_start_page 1816
container_title The Journal of the Acoustical Society of America
container_volume 127
creator Tashev, Ivan
Lovitt, Andrew
Acero, Alex
description Voice activity detectors (VADs) are a critical part of every speech enhancement and speech processing system. One of the major problems in practical realizations is to achieve robust VAD in conditions of background noise. Most of the statistical model-based approaches employ the Gaussian assumption in the discrete Fourier transform domain, which deviates from the real observation. In this paper, we propose a class of VAD algorithms based on several statistical models of the probability density functions of the magnitudes. In addition, we evaluate several approaches for time-smoothing the magnitude response to achieve a more robust estimate. A large data corpus of in-car noise conditions is then used to optimize the parameters of the VAD, and the results are discussed.
doi_str_mv 10.1121/1.3384189
format article
identifier ISSN: 0001-4966
ispartof The Journal of the Acoustical Society of America, 2010-03, Vol.127 (3_Supplement), p.1816-1816
issn 0001-4966
1520-8524
language eng
recordid cdi_crossref_primary_10_1121_1_3384189
source American Institute of Physics:Jisc Collections:Transitional Journals Agreement 2021-23 (Reading list)
title Dual stage probabilistic voice activity detector
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T13%3A56%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Dual%20stage%20probabilistic%20voice%20activity%20detector&rft.jtitle=The%20Journal%20of%20the%20Acoustical%20Society%20of%20America&rft.au=Tashev,%20Ivan&rft.date=2010-03-01&rft.volume=127&rft.issue=3_Supplement&rft.spage=1816&rft.epage=1816&rft.pages=1816-1816&rft.issn=0001-4966&rft.eissn=1520-8524&rft_id=info:doi/10.1121/1.3384189&rft_dat=%3Ccrossref%3E10_1121_1_3384189%3C/crossref%3E%3Cgrp_id%3Ecdi_FETCH-crossref_primary_10_1121_1_33841893%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true