
Adaptive selection of local and non-local attention mechanisms for speech enhancement

In speech enhancement tasks, local and non-local attention mechanisms have been significantly improved and well studied. However, a natural speech signal contains many dynamic and fast-changing acoustic features, and focusing on one type of attention mechanism (local or non-local) cannot precisely capture the most discriminative information for estimating target speech from background interference. To address this issue, we introduce an adaptive selection network to dynamically select an appropriate route that determines whether to use the attention mechanisms and which to use for the task. We train the adaptive selection network using reinforcement learning with a developed difficulty-adjusted reward that is related to the performance, complexity, and difficulty of target speech estimation from the noisy mixtures. Consequently, we propose an Attention Selection Speech Enhancement Network (ASSENet) with the innovative dynamic block that consists of an adaptive selection network and a local and non-local attention based speech enhancement network. In particular, the ASSENet incorporates both local and non-local attention and develops the attention mechanism selection technique to explore the appropriate route of local and non-local attention mechanisms for speech enhancement tasks. The results show that our method achieves comparable and superior performance to existing approaches with attractive computational costs.
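
To make the mechanism described above concrete, the following is a minimal, illustrative PyTorch sketch of such a dynamic block: a small selection head routes each utterance through no attention, window-limited (local) attention, or full (non-local) self-attention. It is not the authors' ASSENet; the class and parameter names (DynamicAttentionBlock, local_window, selector) are hypothetical, and the hard argmax routing merely stands in for the reinforcement-learned policy with the difficulty-adjusted reward described in the abstract.

```python
import torch
import torch.nn as nn


class DynamicAttentionBlock(nn.Module):
    """Route (batch, time, dim) features through one of three paths:
    no attention, local (windowed) attention, or non-local (full) attention."""

    def __init__(self, dim: int, heads: int = 4, local_window: int = 8):
        super().__init__()
        self.local_window = local_window
        # Both branches use standard multi-head self-attention; the "local"
        # branch is restricted by masking positions outside a fixed window.
        self.local_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.nonlocal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Selection head: pooled features -> logits over {skip, local, non-local}.
        self.selector = nn.Sequential(
            nn.Linear(dim, dim // 2), nn.ReLU(), nn.Linear(dim // 2, 3)
        )

    def _band_mask(self, t: int, device: torch.device) -> torch.Tensor:
        # Boolean mask where True blocks attention beyond +/- local_window frames.
        idx = torch.arange(t, device=device)
        return (idx[None, :] - idx[:, None]).abs() > self.local_window

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Hard per-utterance routing via argmax; the paper instead learns the
        # routing policy with reinforcement learning and a difficulty-adjusted reward.
        route = self.selector(x.mean(dim=1)).argmax(dim=-1)  # (batch,)
        outputs = []
        for b in range(x.size(0)):
            xb = x[b : b + 1]
            if route[b] == 0:          # skip attention entirely
                outputs.append(xb)
            elif route[b] == 1:        # local attention within the window
                mask = self._band_mask(xb.size(1), xb.device)
                outputs.append(xb + self.local_attn(xb, xb, xb, attn_mask=mask)[0])
            else:                      # non-local attention over all frames
                outputs.append(xb + self.nonlocal_attn(xb, xb, xb)[0])
        return torch.cat(outputs, dim=0)


if __name__ == "__main__":
    feats = torch.randn(2, 100, 64)    # (batch, frames, feature dim)
    block = DynamicAttentionBlock(dim=64)
    print(block(feats).shape)          # torch.Size([2, 100, 64])
```

Choosing one route per utterance means the unused branches are never evaluated, which is one way such a selector can trade enhancement quality against computational cost.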

Bibliographic Details
Published in: Neural Networks, 2024-06, Vol. 174, Article 106236, p. 106236
Main Authors: Xu, Xinmeng; Tu, Weiping; Yang, Yuhong
Format: Article
Language: English
Subjects: Adaptive selection; Difficulty-adjusted reward; Local and non-local attention; Reinforcement learning; Speech enhancement
Publisher: Elsevier Ltd
DOI: 10.1016/j.neunet.2024.106236
ISSN: 0893-6080
EISSN: 1879-2782
PMID: 38518710