Dual stage probabilistic voice activity detector
Voice activity detectors (VADs) are a critical part of every speech enhancement and speech processing system. One of the major problems in practical realizations is to achieve robust VAD in conditions of background noise. Most of the statistical model-based approaches employ the Gaussian assumption in...
Published in: | The Journal of the Acoustical Society of America 2010-03, Vol.127 (3_Supplement), p.1816-1816 |
---|---|
Main Authors: | Tashev, Ivan ; Lovitt, Andrew ; Acero, Alex |
Format: | Article |
Language: | English |
cited_by | |
---|---|
cites | |
container_end_page | 1816 |
container_issue | 3_Supplement |
container_start_page | 1816 |
container_title | The Journal of the Acoustical Society of America |
container_volume | 127 |
creator | Tashev, Ivan ; Lovitt, Andrew ; Acero, Alex |
description | Voice activity detectors (VADs) are a critical part of every speech enhancement and speech processing system. One of the major problems in practical realizations is to achieve robust VAD in conditions of background noise. Most of the statistical model-based approaches employ the Gaussian assumption in the discrete Fourier transform domain, which deviates from the real observation. In this paper, we propose a class of VAD algorithms based on several statistical models of the probability density functions of the magnitudes. In addition, we evaluate several approaches for time-smoothing the magnitude response to achieve a more robust estimate. A large data corpus of in-car noise conditions is then used to optimize the parameters of the VAD, and the results are discussed. |
doi_str_mv | 10.1121/1.3384189 |
format | article |
fulltext | fulltext |
identifier | ISSN: 0001-4966 |
ispartof | The Journal of the Acoustical Society of America, 2010-03, Vol.127 (3_Supplement), p.1816-1816 |
issn | 0001-4966 1520-8524 |
language | eng |
recordid | cdi_crossref_primary_10_1121_1_3384189 |
source | American Institute of Physics:Jisc Collections:Transitional Journals Agreement 2021-23 (Reading list) |
title | Dual stage probabilistic voice activity detector |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T13%3A56%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Dual%20stage%20probabilistic%20voice%20activity%20detector&rft.jtitle=The%20Journal%20of%20the%20Acoustical%20Society%20of%20America&rft.au=Tashev,%20Ivan&rft.date=2010-03-01&rft.volume=127&rft.issue=3_Supplement&rft.spage=1816&rft.epage=1816&rft.pages=1816-1816&rft.issn=0001-4966&rft.eissn=1520-8524&rft_id=info:doi/10.1121/1.3384189&rft_dat=%3Ccrossref%3E10_1121_1_3384189%3C/crossref%3E%3Cgrp_id%3Ecdi_FETCH-crossref_primary_10_1121_1_33841893%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |