
Adaptive selection of local and non-local attention mechanisms for speech enhancement

In speech enhancement tasks, local and non-local attention mechanisms have been significantly improved and well studied. However, a natural speech signal contains many dynamic and fast-changing acoustic features, and focusing on one type of attention mechanism (local or non-local) cannot precisely capture the most discriminative information for estimating target speech from background interference. To address this issue, we introduce an adaptive selection network to dynamically select an appropriate route that determines whether to use the attention mechanisms and which to use for the task. We train the adaptive selection network using reinforcement learning with a developed difficulty-adjusted reward that is related to the performance, complexity, and difficulty of target speech estimation from the noisy mixtures. Consequently, we propose an Attention Selection Speech Enhancement Network (ASSENet) with the innovative dynamic block that consists of an adaptive selection network and a local and non-local attention based speech enhancement network. In particular, the ASSENet incorporates both local and non-local attention and develops the attention mechanism selection technique to explore the appropriate route of local and non-local attention mechanisms for speech enhancement tasks. The results show that our method achieves comparable and superior performance to existing approaches with attractive computational costs.
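
To make the mechanism described above concrete, the following is a minimal, illustrative PyTorch sketch of such a dynamic block: a small selection head routes each utterance through no attention, window-limited (local) attention, or full (non-local) self-attention. It is not the authors' ASSENet; the class and parameter names (DynamicAttentionBlock, local_window, selector) are hypothetical, and the hard argmax routing merely stands in for the reinforcement-learned policy with the difficulty-adjusted reward described in the abstract.

```python
import torch
import torch.nn as nn


class DynamicAttentionBlock(nn.Module):
    """Route (batch, time, dim) features through one of three paths:
    no attention, local (windowed) attention, or non-local (full) attention."""

    def __init__(self, dim: int, heads: int = 4, local_window: int = 8):
        super().__init__()
        self.local_window = local_window
        # Both branches use standard multi-head self-attention; the "local"
        # branch is restricted by masking positions outside a fixed window.
        self.local_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.nonlocal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Selection head: pooled features -> logits over {skip, local, non-local}.
        self.selector = nn.Sequential(
            nn.Linear(dim, dim // 2), nn.ReLU(), nn.Linear(dim // 2, 3)
        )

    def _band_mask(self, t: int, device: torch.device) -> torch.Tensor:
        # Boolean mask where True blocks attention beyond +/- local_window frames.
        idx = torch.arange(t, device=device)
        return (idx[None, :] - idx[:, None]).abs() > self.local_window

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Hard per-utterance routing via argmax; the paper instead learns the
        # routing policy with reinforcement learning and a difficulty-adjusted reward.
        route = self.selector(x.mean(dim=1)).argmax(dim=-1)  # (batch,)
        outputs = []
        for b in range(x.size(0)):
            xb = x[b : b + 1]
            if route[b] == 0:          # skip attention entirely
                outputs.append(xb)
            elif route[b] == 1:        # local attention within the window
                mask = self._band_mask(xb.size(1), xb.device)
                outputs.append(xb + self.local_attn(xb, xb, xb, attn_mask=mask)[0])
            else:                      # non-local attention over all frames
                outputs.append(xb + self.nonlocal_attn(xb, xb, xb)[0])
        return torch.cat(outputs, dim=0)


if __name__ == "__main__":
    feats = torch.randn(2, 100, 64)    # (batch, frames, feature dim)
    block = DynamicAttentionBlock(dim=64)
    print(block(feats).shape)          # torch.Size([2, 100, 64])
```

Choosing one route per utterance means the unused branches are never evaluated, which is one way such a selector can trade enhancement quality against computational cost.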

Bibliographic Details
Published in: Neural Networks, 2024-06, Vol. 174, Article 106236, p. 106236
Main Authors: Xu, Xinmeng; Tu, Weiping; Yang, Yuhong
Format: Article
Language: English
Subjects: Adaptive selection; Difficulty-adjusted reward; Local and non-local attention; Reinforcement learning; Speech enhancement
Publisher: Elsevier Ltd
DOI: 10.1016/j.neunet.2024.106236
ISSN: 0893-6080
EISSN: 1879-2782
PMID: 38518710