
Learning to Match Transient Sound Events Using Attentional Similarity for Few-shot Sound Recognition

Bibliographic Details
Main Authors: Chou, Szu-Yu; Cheng, Kai-Hsiang; Jang, Jyh-Shing Roger; Yang, Yi-Hsuan
Format: Conference Proceeding
Language: English
Description: In this paper, we introduce a novel attentional similarity module for the problem of few-shot sound recognition. Given a few examples of an unseen sound event, a classifier must be quickly adapted to recognize the new sound event without much fine-tuning. The proposed attentional similarity module can be plugged into any metric-based learning method for few-shot learning, allowing the resulting model to especially match related short sound events. Extensive experiments on two datasets show that the proposed module consistently improves the performance of five different metric-based learning methods for few-shot sound recognition. The relative improvement ranges from +4.1% to +7.7% in 5-shot 5-way accuracy on the ESC-50 dataset, and from +2.1% to +6.5% on noiseESC-50. Qualitative results demonstrate that our method contributes in particular to the recognition of transient sound events.
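The description above says the module lets a metric-based few-shot learner attend to the short, transient frames that matter when comparing two sounds. As a rough, hypothetical illustration only (the function name, the softmax attention over query frames, and the best-match pooling are all assumptions; the record does not give the paper's actual architecture), a frame-wise attentional similarity score might be sketched as:

```python
import numpy as np

def attentional_similarity(query_feats, support_feats, attn_logits):
    """Hedged sketch: score two frame-wise feature sequences (time x dim).

    attn_logits is a hypothetical per-query-frame relevance score; in a real
    model it would be produced by a learned attention network, not supplied
    by hand.
    """
    # L2-normalize each frame so the dot product below is cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    s = support_feats / np.linalg.norm(support_feats, axis=1, keepdims=True)

    # Frame-by-frame cosine similarity matrix, shape (T_query, T_support).
    sim = q @ s.T

    # Softmax attention over query frames: a transient event occupying a few
    # frames can dominate the match instead of being averaged away.
    w = np.exp(attn_logits - attn_logits.max())
    w = w / w.sum()

    # For each query frame, take its best-matching support frame, then pool
    # with the attention weights into a single scalar similarity.
    return float((w * sim.max(axis=1)).sum())
```

With uniform attention (all-zero logits) and identical inputs the score is 1.0, and it decreases as the sequences diverge; a real implementation would plug this score into the distance used by the chosen metric-based few-shot method.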
DOI: 10.1109/ICASSP.2019.8682558
EISSN: 2379-190X
Published in: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, p. 26-30
Source: IEEE Xplore All Conference Series
Subjects: deep learning; Feature extraction; Few-shot learning; Image color analysis; Learning systems; Noise measurement; sound event detection; Task analysis; Training; Transient analysis; transient sound event