Voice Activity Detection in Noisy Environments Based on Double-Combined Fourier Transform and Line Fitting
A new voice activity detector for noisy environments is proposed. In conventional algorithms, the endpoint of speech is found by applying an edge detection filter that finds the abrupt changing point in a feature domain. However, since the frame energy feature is unstable in noisy environments, it i...
Published in: | TheScientificWorld 2014-01, Vol.2014 (2014), p.1-12 |
---|---|
Main Authors: | Ko, Hanseok; Han, David K.; Kim, Wooil; Park, Jinsoo |
Format: | Article |
Language: | English |
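Subjects: | Algorithms; Computer engineering; Experiments; Fourier transformations; Fourier transforms; Methods; Models, Theoretical; Noise; Speech; Speech Acoustics; Voice recognition; Wavelet transforms |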
cited_by | cdi_FETCH-LOGICAL-c638t-ac86e00390bd67bbc0f61e490e9675bc273b12be238a4b31b5eeabd7e071e2b13 |
---|---|
cites | cdi_FETCH-LOGICAL-c638t-ac86e00390bd67bbc0f61e490e9675bc273b12be238a4b31b5eeabd7e071e2b13 |
container_end_page | 12 |
container_issue | 2014 |
container_start_page | 1 |
container_title | TheScientificWorld |
container_volume | 2014 |
creator | Ko, Hanseok; Han, David K.; Kim, Wooil; Park, Jinsoo |
description | A new voice activity detector for noisy environments is proposed. In conventional algorithms, the endpoint of speech is found by applying an edge detection filter that finds the abrupt changing point in a feature domain. However, since the frame energy feature is unstable in noisy environments, it is difficult to accurately find the endpoint of speech. Therefore, a novel feature extraction algorithm based on the double-combined Fourier transform and envelope line fitting is proposed. It is combined with an edge detection filter for effective detection of endpoints. Effectiveness of the proposed algorithm is evaluated and compared to other VAD algorithms using two different databases, which are AURORA 2.0 database and SITEC database. Experimental results show that the proposed algorithm performs well under a variety of noisy conditions. |
doi_str_mv | 10.1155/2014/146040 |
format | article |
addtitle | ScientificWorldJournal |
contributor | Gorriz Saez, Juan Manuel |
eissn | 1537-744X |
pmid | 25170520 |
publisher | Cairo, Egypt: Hindawi Publishing Corporation |
rights | Copyright © 2014 Jinsoo Park et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. COPYRIGHT 2014 John Wiley & Sons, Inc. |
oa | free_for_read |
toplevel | peer_reviewed; online_resources |
tpages | 12 |
galeid | A413710371 |
pqid | 1553698361 |
linktopdf | https://www.proquest.com/docview/1553698361/fulltextPDF?pq-origsite=primo |
linktohtml | https://www.proquest.com/docview/1553698361?pq-origsite=primo |
backlink | https://www.ncbi.nlm.nih.gov/pubmed/25170520 (View this record in MEDLINE/PubMed) |
fulltext | fulltext |
identifier | ISSN: 2356-6140 |
ispartof | TheScientificWorld, 2014-01, Vol.2014 (2014), p.1-12 |
issn | 2356-6140; 1537-744X |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_7801d569e0eb45b5b7c8f04f5a78f75d |
source | Publicly Available Content Database; Wiley_OA刊; PubMed Central |
subjects | Algorithms; Computer engineering; Experiments; Fourier transformations; Fourier transforms; Methods; Models, Theoretical; Noise; Speech; Speech Acoustics; Voice recognition; Wavelet transforms |
title | Voice Activity Detection in Noisy Environments Based on Double-Combined Fourier Transform and Line Fitting |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T14%3A27%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Voice%20Activity%20Detection%20in%20Noisy%20Environments%20Based%20on%20Double-Combined%20Fourier%20Transform%20and%20Line%20Fitting&rft.jtitle=TheScientificWorld&rft.au=Ko,%20Hanseok&rft.date=2014-01-01&rft.volume=2014&rft.issue=2014&rft.spage=1&rft.epage=12&rft.pages=1-12&rft.issn=2356-6140&rft.eissn=1537-744X&rft_id=info:doi/10.1155/2014/146040&rft_dat=%3Cgale_doaj_%3EA413710371%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c638t-ac86e00390bd67bbc0f61e490e9675bc273b12be238a4b31b5eeabd7e071e2b13%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1553698361&rft_id=info:pmid/25170520&rft_galeid=A413710371&rfr_iscdi=true |