
Voice Activity Detection in Noisy Environments Based on Double-Combined Fourier Transform and Line Fitting

A new voice activity detector for noisy environments is proposed. In conventional algorithms, the endpoint of speech is found by applying an edge detection filter that finds the abrupt changing point in a feature domain. However, since the frame energy feature is unstable in noisy environments, it i...

Bibliographic Details
Published in: TheScientificWorld, 2014-01, Vol.2014 (2014), p.1-12
Main Authors: Ko, Hanseok; Han, David K.; Kim, Wooil; Park, Jinsoo
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c638t-ac86e00390bd67bbc0f61e490e9675bc273b12be238a4b31b5eeabd7e071e2b13
cites cdi_FETCH-LOGICAL-c638t-ac86e00390bd67bbc0f61e490e9675bc273b12be238a4b31b5eeabd7e071e2b13
container_end_page 12
container_issue 2014
container_start_page 1
container_title TheScientificWorld
container_volume 2014
creator Ko, Hanseok
Han, David K.
Kim, Wooil
Park, Jinsoo
description A new voice activity detector for noisy environments is proposed. In conventional algorithms, the endpoint of speech is found by applying an edge detection filter that locates abrupt change points in a feature domain. However, because the frame energy feature is unstable in noisy environments, it is difficult to find the endpoint of speech accurately. Therefore, a novel feature extraction algorithm based on the double-combined Fourier transform and envelope line fitting is proposed. It is combined with an edge detection filter for effective detection of endpoints. The effectiveness of the proposed algorithm is evaluated and compared with other VAD algorithms on two databases, AURORA 2.0 and SITEC. Experimental results show that the proposed algorithm performs well under a variety of noisy conditions.
doi_str_mv 10.1155/2014/146040
format article
fullrecord <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_7801d569e0eb45b5b7c8f04f5a78f75d</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A413710371</galeid><doaj_id>oai_doaj_org_article_7801d569e0eb45b5b7c8f04f5a78f75d</doaj_id><sourcerecordid>A413710371</sourcerecordid><originalsourceid>FETCH-LOGICAL-c638t-ac86e00390bd67bbc0f61e490e9675bc273b12be238a4b31b5eeabd7e071e2b13</originalsourceid><addsrcrecordid>eNqNkk1vEzEQhi0EoqFw4o4scQNta6-_di9IIW2gUgSXgrhZ_phNHWXt1rsJyr_HYUvV3pBl2Zp559E79iD0lpIzSoU4rwnl55RLwskzNKOCqUpx_us5mtVMyEpSTk7Qq2HYEMIaRcVLdFILqoioyQxtfqbgAM_dGPZhPOALGKHcU8Qh4m8pDAd8Gfchp9hDHAf82QzgcUlfpJ3dQrVIvQ2xhJZplwNkfJ1NHLqUe2yix6uSw8swjiGuX6MXndkO8Ob-PEU_lpfXi6_V6vuXq8V8VTnJmrEyrpFQrLbEeqmsdaSTFHhLoJVKWFcrZmltoWaN4ZZRKwCM9QqIolBbyk7R1cT1yWz0bQ69yQedTNB_AymvtcljcFvQqiHUC9kCAcuFFVa5piO8E0Y1nRK-sD5NrNud7cG78gbZbJ9An2ZiuNHrtNec8poKWQDv7wE53e1gGPWmPFQs_evyd0y2DZNHy2eTam2KqxC7VGCuLA99cClCF0p8zilTlJRdCj5OBS6nYcjQPViiRB-nQh-nQk9TUdTvHnfxoP03BkXwYRLchOjN7_B_NCgS6MwjMW-4lOwPMADJPA</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1553698361</pqid></control><display><type>article</type><title>Voice Activity Detection in Noisy Environments Based on Double-Combined Fourier Transform and Line Fitting</title><source>Publicly Available Content Database</source><source>Wiley_OA刊</source><source>PubMed Central</source><creator>Ko, Hanseok ; Han, David K. ; Kim, Wooil ; Park, Jinsoo</creator><contributor>Gorriz Saez, Juan Manuel</contributor><creatorcontrib>Ko, Hanseok ; Han, David K. ; Kim, Wooil ; Park, Jinsoo ; Gorriz Saez, Juan Manuel</creatorcontrib><description>A new voice activity detector for noisy environments is proposed. In conventional algorithms, the endpoint of speech is found by applying an edge detection filter that finds the abrupt changing point in a feature domain. 
However, since the frame energy feature is unstable in noisy environments, it is difficult to accurately find the endpoint of speech. Therefore, a novel feature extraction algorithm based on the double-combined Fourier transform and envelope line fitting is proposed. It is combined with an edge detection filter for effective detection of endpoints. Effectiveness of the proposed algorithm is evaluated and compared to other VAD algorithms using two different databases, which are AURORA 2.0 database and SITEC database. Experimental results show that the proposed algorithm performs well under a variety of noisy conditions.</description><identifier>ISSN: 2356-6140</identifier><identifier>ISSN: 1537-744X</identifier><identifier>EISSN: 1537-744X</identifier><identifier>DOI: 10.1155/2014/146040</identifier><identifier>PMID: 25170520</identifier><language>eng</language><publisher>Cairo, Egypt: Hindawi Publishing Corporation</publisher><subject>Algorithms ; Computer engineering ; Experiments ; Fourier transformations ; Fourier transforms ; Methods ; Models, Theoretical ; Noise ; Speech ; Speech Acoustics ; Voice recognition ; Wavelet transforms</subject><ispartof>TheScientificWorld, 2014-01, Vol.2014 (2014), p.1-12</ispartof><rights>Copyright © 2014 Jinsoo Park et al.</rights><rights>COPYRIGHT 2014 John Wiley &amp; Sons, Inc.</rights><rights>Copyright © 2014 Jinsoo Park et al. Jinsoo Park et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</rights><rights>Copyright © 2014 Jinsoo Park et al. 
2014</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c638t-ac86e00390bd67bbc0f61e490e9675bc273b12be238a4b31b5eeabd7e071e2b13</citedby><cites>FETCH-LOGICAL-c638t-ac86e00390bd67bbc0f61e490e9675bc273b12be238a4b31b5eeabd7e071e2b13</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/1553698361/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/1553698361?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,25753,27924,27925,37012,44590,53791,53793,75126</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/25170520$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Gorriz Saez, Juan Manuel</contributor><creatorcontrib>Ko, Hanseok</creatorcontrib><creatorcontrib>Han, David K.</creatorcontrib><creatorcontrib>Kim, Wooil</creatorcontrib><creatorcontrib>Park, Jinsoo</creatorcontrib><title>Voice Activity Detection in Noisy Environments Based on Double-Combined Fourier Transform and Line Fitting</title><title>TheScientificWorld</title><addtitle>ScientificWorldJournal</addtitle><description>A new voice activity detector for noisy environments is proposed. In conventional algorithms, the endpoint of speech is found by applying an edge detection filter that finds the abrupt changing point in a feature domain. However, since the frame energy feature is unstable in noisy environments, it is difficult to accurately find the endpoint of speech. Therefore, a novel feature extraction algorithm based on the double-combined Fourier transform and envelope line fitting is proposed. 
It is combined with an edge detection filter for effective detection of endpoints. Effectiveness of the proposed algorithm is evaluated and compared to other VAD algorithms using two different databases, which are AURORA 2.0 database and SITEC database. Experimental results show that the proposed algorithm performs well under a variety of noisy conditions.</description><subject>Algorithms</subject><subject>Computer engineering</subject><subject>Experiments</subject><subject>Fourier transformations</subject><subject>Fourier transforms</subject><subject>Methods</subject><subject>Models, Theoretical</subject><subject>Noise</subject><subject>Speech</subject><subject>Speech Acoustics</subject><subject>Voice recognition</subject><subject>Wavelet transforms</subject><issn>2356-6140</issn><issn>1537-744X</issn><issn>1537-744X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNqNkk1vEzEQhi0EoqFw4o4scQNta6-_di9IIW2gUgSXgrhZ_phNHWXt1rsJyr_HYUvV3pBl2Zp559E79iD0lpIzSoU4rwnl55RLwskzNKOCqUpx_us5mtVMyEpSTk7Qq2HYEMIaRcVLdFILqoioyQxtfqbgAM_dGPZhPOALGKHcU8Qh4m8pDAd8Gfchp9hDHAf82QzgcUlfpJ3dQrVIvQ2xhJZplwNkfJ1NHLqUe2yix6uSw8swjiGuX6MXndkO8Ob-PEU_lpfXi6_V6vuXq8V8VTnJmrEyrpFQrLbEeqmsdaSTFHhLoJVKWFcrZmltoWaN4ZZRKwCM9QqIolBbyk7R1cT1yWz0bQ69yQedTNB_AymvtcljcFvQqiHUC9kCAcuFFVa5piO8E0Y1nRK-sD5NrNud7cG78gbZbJ9An2ZiuNHrtNec8poKWQDv7wE53e1gGPWmPFQs_evyd0y2DZNHy2eTam2KqxC7VGCuLA99cClCF0p8zilTlJRdCj5OBS6nYcjQPViiRB-nQh-nQk9TUdTvHnfxoP03BkXwYRLchOjN7_B_NCgS6MwjMW-4lOwPMADJPA</recordid><startdate>20140101</startdate><enddate>20140101</enddate><creator>Ko, Hanseok</creator><creator>Han, David K.</creator><creator>Kim, Wooil</creator><creator>Park, Jinsoo</creator><general>Hindawi Publishing Corporation</general><general>John Wiley &amp; Sons, Inc</general><general>Hindawi 
Limited</general><scope>ADJCN</scope><scope>AHFXO</scope><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QP</scope><scope>7TK</scope><scope>7TM</scope><scope>7X2</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>CWDGH</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>M0K</scope><scope>M0S</scope><scope>M1P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>RC3</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20140101</creationdate><title>Voice Activity Detection in Noisy Environments Based on Double-Combined Fourier Transform and Line Fitting</title><author>Ko, Hanseok ; Han, David K. 
; Kim, Wooil ; Park, Jinsoo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c638t-ac86e00390bd67bbc0f61e490e9675bc273b12be238a4b31b5eeabd7e071e2b13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Algorithms</topic><topic>Computer engineering</topic><topic>Experiments</topic><topic>Fourier transformations</topic><topic>Fourier transforms</topic><topic>Methods</topic><topic>Models, Theoretical</topic><topic>Noise</topic><topic>Speech</topic><topic>Speech Acoustics</topic><topic>Voice recognition</topic><topic>Wavelet transforms</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ko, Hanseok</creatorcontrib><creatorcontrib>Han, David K.</creatorcontrib><creatorcontrib>Kim, Wooil</creatorcontrib><creatorcontrib>Park, Jinsoo</creatorcontrib><collection>الدوريات العلمية والإحصائية - e-Marefa Academic and Statistical Periodicals</collection><collection>معرفة - المحتوى العربي الأكاديمي المتكامل - e-Marefa Academic Complete</collection><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Agricultural Science Collection</collection><collection>ProQuest_Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Technology Research 
Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies &amp; Aerospace Database‎ (1962 - current)</collection><collection>Agricultural &amp; Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>Middle East &amp; Africa Database</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Agricultural Science Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>PML(ProQuest Medical Library)</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>Genetics 
Abstracts</collection><collection>PubMed Central (Full Participant titles)</collection><collection>Directory of Open Access Journals</collection><jtitle>TheScientificWorld</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ko, Hanseok</au><au>Han, David K.</au><au>Kim, Wooil</au><au>Park, Jinsoo</au><au>Gorriz Saez, Juan Manuel</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Voice Activity Detection in Noisy Environments Based on Double-Combined Fourier Transform and Line Fitting</atitle><jtitle>TheScientificWorld</jtitle><addtitle>ScientificWorldJournal</addtitle><date>2014-01-01</date><risdate>2014</risdate><volume>2014</volume><issue>2014</issue><spage>1</spage><epage>12</epage><pages>1-12</pages><issn>2356-6140</issn><issn>1537-744X</issn><eissn>1537-744X</eissn><abstract>A new voice activity detector for noisy environments is proposed. In conventional algorithms, the endpoint of speech is found by applying an edge detection filter that finds the abrupt changing point in a feature domain. However, since the frame energy feature is unstable in noisy environments, it is difficult to accurately find the endpoint of speech. Therefore, a novel feature extraction algorithm based on the double-combined Fourier transform and envelope line fitting is proposed. It is combined with an edge detection filter for effective detection of endpoints. Effectiveness of the proposed algorithm is evaluated and compared to other VAD algorithms using two different databases, which are AURORA 2.0 database and SITEC database. Experimental results show that the proposed algorithm performs well under a variety of noisy conditions.</abstract><cop>Cairo, Egypt</cop><pub>Hindawi Publishing Corporation</pub><pmid>25170520</pmid><doi>10.1155/2014/146040</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2356-6140
ispartof TheScientificWorld, 2014-01, Vol.2014 (2014), p.1-12
issn 2356-6140
1537-744X
1537-744X
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_7801d569e0eb45b5b7c8f04f5a78f75d
source Publicly Available Content Database; Wiley_OA刊; PubMed Central
subjects Algorithms
Computer engineering
Experiments
Fourier transformations
Fourier transforms
Methods
Models, Theoretical
Noise
Speech
Speech Acoustics
Voice recognition
Wavelet transforms
title Voice Activity Detection in Noisy Environments Based on Double-Combined Fourier Transform and Line Fitting
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T14%3A27%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Voice%20Activity%20Detection%20in%20Noisy%20Environments%20Based%20on%20Double-Combined%20Fourier%20Transform%20and%20Line%20Fitting&rft.jtitle=TheScientificWorld&rft.au=Ko,%20Hanseok&rft.date=2014-01-01&rft.volume=2014&rft.issue=2014&rft.spage=1&rft.epage=12&rft.pages=1-12&rft.issn=2356-6140&rft.eissn=1537-744X&rft_id=info:doi/10.1155/2014/146040&rft_dat=%3Cgale_doaj_%3EA413710371%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c638t-ac86e00390bd67bbc0f61e490e9675bc273b12be238a4b31b5eeabd7e071e2b13%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1553698361&rft_id=info:pmid/25170520&rft_galeid=A413710371&rfr_iscdi=true