AURA: Privacy-Preserving Augmentation to Improve Test Set Diversity in Speech Enhancement

Speech enhancement models running in production environments are commonly trained on publicly available data. This approach leads to regressions due to the lack of training/testing on representative customer data. Moreover, due to privacy reasons, developers cannot listen to customer content. This ...

Bibliographic Details
Main Authors: Gitiaux, Xavier; Khant, Aditya; Beyrami, Ebrahim; Reddy, Chandan; Gupchup, Jayant; Cutler, Ross
Format: Conference Proceeding
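Language: English
Subjects: Echo cancellers; Noise reduction; Ontologies; Privacy; Production; Signal processing; Speech enhancement; test set
Online Access: Request full text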
cited_by
cites
container_end_page 5
container_issue
container_start_page 1
container_title
container_volume
creator Gitiaux, Xavier
Khant, Aditya
Beyrami, Ebrahim
Reddy, Chandan
Gupchup, Jayant
Cutler, Ross
description Speech enhancement models running in production environments are commonly trained on publicly available data. This approach leads to regressions due to the lack of training/testing on representative customer data. Moreover, due to privacy reasons, developers cannot listen to customer content. This 'ears-off' situation motivates Aura, an end-to-end solution to make existing speech enhancement train and test sets more challenging and diverse while being sample efficient. Aura is 'ears-off' because it relies on a feature extractor and on metrics of speech quality, DNSMOS P.835 and AECMOS, that are pre-trained on data obtained from public sources. We evaluate Aura on two speech enhancement tasks: noise suppression (NS) and audio echo cancellation (AEC). Aura samples an NS test set that is 0.42 harder in terms of P.835 OVRL than random sampling, and an AEC test set that is 1.93 harder in AECMOS. Aura also increases diversity by 30% for NS tasks and by 530% for AEC tasks compared to greedy sampling. Finally, Aura achieves a 26% improvement in Spearman's rank correlation coefficient (SRCC) compared to random sampling when used to stack rank NS models.
doi_str_mv 10.1109/ICASSP49357.2023.10096879
format conference_proceeding
eisbn 1728163277
9781728163277
genre proceeding
linktohtml https://ieeexplore.ieee.org/document/10096879
oa free_for_read
publisher IEEE
ristype CONF
tpages 5
fulltext fulltext_linktorsrc
identifier EISSN: 2379-190X
ispartof ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023, p.1-5
issn 2379-190X
language eng
recordid cdi_ieee_primary_10096879
source IEEE Xplore All Conference Series
subjects Echo cancellers
Noise reduction
Ontologies
Privacy
Production
Signal processing
Speech enhancement
test set
title AURA: Privacy-Preserving Augmentation to Improve Test Set Diversity in Speech Enhancement
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T11%3A46%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=AURA:%20Privacy-Preserving%20Augmentation%20to%20Improve%20Test%20Set%20Diversity%20in%20Speech%20Enhancement&rft.btitle=ICASSP%202023%20-%202023%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech%20and%20Signal%20Processing%20(ICASSP)&rft.au=Gitiaux,%20Xavier&rft.date=2023&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.eissn=2379-190X&rft_id=info:doi/10.1109/ICASSP49357.2023.10096879&rft.eisbn=1728163277&rft.eisbn_list=9781728163277&rft_dat=%3Cieee_CHZPO%3E10096879%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i1709-f01be3b6c1cc751538b4f9e09384667854db4b93d63d0344b9fd0bf6b11fdd4d3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10096879&rfr_iscdi=true