Data-Dependent Feature Extraction Method Based on Non-Negative Matrix Factorization for Weakly Supervised Domestic Sound Event Detection

In this paper, feature extraction methods are developed based on the non-negative matrix factorization (NMF) algorithm to be applied in weakly supervised sound event detection. Recently, various features and systems have been developed to tackle the problems of acoustic scene clas...

Bibliographic Details
Published in:	Applied sciences 2021-01, Vol.11 (3), p.1040
Main Authors:	Lee, Seokjin ; Kim, Minhan ; Shin, Seunghyeon ; Park, Sooyoung ; Jeong, Youngho
Format:	Article
Language:	English
Subjects:	Acoustics ; Artificial intelligence ; Classification ; Datasets ; Factorization ; feature extraction ; Machine learning ; Methods ; Neural networks ; Noise ; non-negative matrix factorization ; Post-production processing ; Principal components analysis ; Signal processing ; Sound ; sound event detection ; Speech

cited_by	cdi_FETCH-LOGICAL-c392t-c2c15d97efc71c4ff08a9dabcc969e90449ac8eb031f3dea64751be4d9f1976a3
cites	cdi_FETCH-LOGICAL-c392t-c2c15d97efc71c4ff08a9dabcc969e90449ac8eb031f3dea64751be4d9f1976a3
container_end_page
container_issue	3
container_start_page	1040
container_title	Applied sciences
container_volume	11
creator	Lee, Seokjin ; Kim, Minhan ; Shin, Seunghyeon ; Park, Sooyoung ; Jeong, Youngho
description	In this paper, feature extraction methods are developed based on the non-negative matrix factorization (NMF) algorithm to be applied in weakly supervised sound event detection. Recently, various features and systems have been developed to tackle the problems of acoustic scene classification and sound event detection. However, most of these systems use data-independent spectral features, e.g., Mel-spectrogram, log-Mel-spectrum, and gammatone filterbank. Some data-dependent feature extraction methods, including the NMF-based methods, recently demonstrated the potential to tackle the problems mentioned above for long-term acoustic signals. In this paper, we further develop the recently proposed NMF-based feature extraction method to enable its application in weakly supervised sound event detection. To achieve this goal, we develop a strategy for training the frequency basis matrix using a heterogeneous database consisting of strongly and weakly labeled data. Moreover, we develop a non-iterative version of the NMF-based feature extraction method so that it can be applied as a part of the model structure, similar to the modern “on-the-fly” transform method for the Mel-spectrogram. To detect the sound events, the temporal basis is calculated using the NMF method and then used as a feature for the mean-teacher-model-based classifier. The results are improved for the event-wise post-processing method. To evaluate the proposed system, simulations of the weakly supervised sound event detection were conducted using the Detection and Classification of Acoustic Scenes and Events 2020 Task 4 database. The results reveal that the proposed system achieves F1-score performance comparable to that of the Mel-spectrogram and gammatonegram and exhibits 3–5% better performance than the log-Mel-spectrum and constant-Q transform.
doi_str_mv	10.3390/app11031040
format	article
fullrecord	<record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_0810b23424d442828d9ad417ba3feea9</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_0810b23424d442828d9ad417ba3feea9</doaj_id><sourcerecordid>2524471960</sourcerecordid><originalsourceid>FETCH-LOGICAL-c392t-c2c15d97efc71c4ff08a9dabcc969e90449ac8eb031f3dea64751be4d9f1976a3</originalsourceid><addsrcrecordid>eNp9UU1PGzEQXSEqgSgn_oAljtVSf2V3fSwkoUghPQDiaM3aY9gQ1ovXGyX9Bf3ZdRJU5VRf7LHevPfmTZZdMHolhKLfoesYo4JRSY-yU07LIheSlccH75PsvO8XNB3FRMXoafZnDBHyMXbYWmwjmSLEISCZrGMAExvfknuMr96Sa-jRklTPfZvP8QVis0JyDzE0azJNWB-a37DrcD6QZ4S35YY8DB2GVbNtHft37GNjyIMfWksmq63eGCPuZL5mXxwsezz_vM-yp-nk8eZnPvt1e3fzY5YboXjMDTdsZFWJzpTMSOdoBcpCbYwqFCoqpQJTYZ1ycMIiFLIcsRqlVY6psgBxlt3tea2Hhe5C8w5hoz00evfhw4uGkFwuUdOUUM2F5NJKySteWQU2pViDcIigEtflnqsL_mNIw-mFH0Kb7Gs-4lKWTBX0vyhZCTFSnLOE-rZHmeD7PqD7541Rvd2vPtiv-As70JjE</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2483359221</pqid></control><display><type>article</type><title>Data-Dependent Feature Extraction Method Based on Non-Negative Matrix Factorization for Weakly Supervised Domestic Sound Event Detection</title><source>Publicly Available Content (ProQuest)</source><creator>Lee, Seokjin ; Kim, Minhan ; Shin, Seunghyeon ; Park, Sooyoung ; Jeong, Youngho</creator><creatorcontrib>Lee, Seokjin ; Kim, Minhan ; Shin, Seunghyeon ; Park, Sooyoung ; Jeong, Youngho</creatorcontrib><description>In this paper, feature extraction methods are developed based on the non-negative matrix factorization (NMF) algorithm to be applied in weakly supervised sound event detection. Recently, the development of various features and systems have been attempted to tackle the problems of acoustic scene classification and sound event detection. 
However, most of these systems use data-independent spectral features, e.g., Mel-spectrogram, log-Mel-spectrum, and gammatone filterbank. Some data-dependent feature extraction methods, including the NMF-based methods, recently demonstrated the potential to tackle the problems mentioned above for long-term acoustic signals. In this paper, we further develop the recently proposed NMF-based feature extraction method to enable its application in weakly supervised sound event detection. To achieve this goal, we develop a strategy for training the frequency basis matrix using a heterogeneous database consisting of strongly- and weakly-labeled data. Moreover, we develop a non-iterative version of the NMF-based feature extraction method so that the proposed feature extraction method can be applied as a part of the model structure similar to the modern “on-the-fly” transform method for the Mel-spectrogram. To detect the sound events, the temporal basis is calculated using the NMF method and then used as a feature for the mean-teacher-model-based classifier. The results are improved for the event-wise post-processing method. To evaluate the proposed system, simulations of the weakly supervised sound event detection were conducted using the Detection and Classification of Acoustic Scenes and Events 2020 Task 4 database. 
The results reveal that the proposed system has F1-score performance comparable with the Mel-spectrogram and gammatonegram and exhibits 3–5% better performance than the log-Mel-spectrum and constant-Q transform.</description><identifier>ISSN: 2076-3417</identifier><identifier>EISSN: 2076-3417</identifier><identifier>DOI: 10.3390/app11031040</identifier><language>eng</language><publisher>Basel: MDPI AG</publisher><subject>Acoustics ; Artificial intelligence ; Classification ; Datasets ; Factorization ; feature extraction ; Machine learning ; Methods ; Neural networks ; Noise ; non-negative matrix factorization ; Post-production processing ; Principal components analysis ; Signal processing ; Sound ; sound event detection ; Speech</subject><ispartof>Applied sciences, 2021-01, Vol.11 (3), p.1040</ispartof><rights>2021. This work is licensed under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c392t-c2c15d97efc71c4ff08a9dabcc969e90449ac8eb031f3dea64751be4d9f1976a3</citedby><cites>FETCH-LOGICAL-c392t-c2c15d97efc71c4ff08a9dabcc969e90449ac8eb031f3dea64751be4d9f1976a3</cites><orcidid>0000-0001-8220-192X ; 0000-0001-9552-8593</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/2483359221/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2483359221?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,25731,27901,27902,36989,44566,74869</link.rule.ids></links><search><creatorcontrib>Lee, Seokjin</creatorcontrib><creatorcontrib>Kim, Minhan</creatorcontrib><creatorcontrib>Shin, Seunghyeon</creatorcontrib><creatorcontrib>Park, Sooyoung</creatorcontrib><creatorcontrib>Jeong, Youngho</creatorcontrib><title>Data-Dependent Feature Extraction Method Based on Non-Negative Matrix Factorization for Weakly Supervised Domestic Sound Event Detection</title><title>Applied sciences</title><description>In this paper, feature extraction methods are developed based on the non-negative matrix factorization (NMF) algorithm to be applied in weakly supervised sound event detection. Recently, the development of various features and systems have been attempted to tackle the problems of acoustic scene classification and sound event detection. However, most of these systems use data-independent spectral features, e.g., Mel-spectrogram, log-Mel-spectrum, and gammatone filterbank. 
Some data-dependent feature extraction methods, including the NMF-based methods, recently demonstrated the potential to tackle the problems mentioned above for long-term acoustic signals. In this paper, we further develop the recently proposed NMF-based feature extraction method to enable its application in weakly supervised sound event detection. To achieve this goal, we develop a strategy for training the frequency basis matrix using a heterogeneous database consisting of strongly- and weakly-labeled data. Moreover, we develop a non-iterative version of the NMF-based feature extraction method so that the proposed feature extraction method can be applied as a part of the model structure similar to the modern “on-the-fly” transform method for the Mel-spectrogram. To detect the sound events, the temporal basis is calculated using the NMF method and then used as a feature for the mean-teacher-model-based classifier. The results are improved for the event-wise post-processing method. To evaluate the proposed system, simulations of the weakly supervised sound event detection were conducted using the Detection and Classification of Acoustic Scenes and Events 2020 Task 4 database. 
The results reveal that the proposed system has F1-score performance comparable with the Mel-spectrogram and gammatonegram and exhibits 3–5% better performance than the log-Mel-spectrum and constant-Q transform.</description><subject>Acoustics</subject><subject>Artificial intelligence</subject><subject>Classification</subject><subject>Datasets</subject><subject>Factorization</subject><subject>feature extraction</subject><subject>Machine learning</subject><subject>Methods</subject><subject>Neural networks</subject><subject>Noise</subject><subject>non-negative matrix factorization</subject><subject>Post-production processing</subject><subject>Principal components analysis</subject><subject>Signal processing</subject><subject>Sound</subject><subject>sound event detection</subject><subject>Speech</subject><issn>2076-3417</issn><issn>2076-3417</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNp9UU1PGzEQXSEqgSgn_oAljtVSf2V3fSwkoUghPQDiaM3aY9gQ1ovXGyX9Bf3ZdRJU5VRf7LHevPfmTZZdMHolhKLfoesYo4JRSY-yU07LIheSlccH75PsvO8XNB3FRMXoafZnDBHyMXbYWmwjmSLEISCZrGMAExvfknuMr96Sa-jRklTPfZvP8QVis0JyDzE0azJNWB-a37DrcD6QZ4S35YY8DB2GVbNtHft37GNjyIMfWksmq63eGCPuZL5mXxwsezz_vM-yp-nk8eZnPvt1e3fzY5YboXjMDTdsZFWJzpTMSOdoBcpCbYwqFCoqpQJTYZ1ycMIiFLIcsRqlVY6psgBxlt3tea2Hhe5C8w5hoz00evfhw4uGkFwuUdOUUM2F5NJKySteWQU2pViDcIigEtflnqsL_mNIw-mFH0Kb7Gs-4lKWTBX0vyhZCTFSnLOE-rZHmeD7PqD7541Rvd2vPtiv-As70JjE</recordid><startdate>20210101</startdate><enddate>20210101</enddate><creator>Lee, Seokjin</creator><creator>Kim, Minhan</creator><creator>Shin, Seunghyeon</creator><creator>Park, Sooyoung</creator><creator>Jeong, Youngho</creator><general>MDPI 
AG</general><scope>AAYXX</scope><scope>CITATION</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-8220-192X</orcidid><orcidid>https://orcid.org/0000-0001-9552-8593</orcidid></search><sort><creationdate>20210101</creationdate><title>Data-Dependent Feature Extraction Method Based on Non-Negative Matrix Factorization for Weakly Supervised Domestic Sound Event Detection</title><author>Lee, Seokjin ; Kim, Minhan ; Shin, Seunghyeon ; Park, Sooyoung ; Jeong, Youngho</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c392t-c2c15d97efc71c4ff08a9dabcc969e90449ac8eb031f3dea64751be4d9f1976a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Acoustics</topic><topic>Artificial intelligence</topic><topic>Classification</topic><topic>Datasets</topic><topic>Factorization</topic><topic>feature extraction</topic><topic>Machine learning</topic><topic>Methods</topic><topic>Neural networks</topic><topic>Noise</topic><topic>non-negative matrix factorization</topic><topic>Post-production processing</topic><topic>Principal components analysis</topic><topic>Signal processing</topic><topic>Sound</topic><topic>sound event detection</topic><topic>Speech</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lee, Seokjin</creatorcontrib><creatorcontrib>Kim, Minhan</creatorcontrib><creatorcontrib>Shin, Seunghyeon</creatorcontrib><creatorcontrib>Park, Sooyoung</creatorcontrib><creatorcontrib>Jeong, Youngho</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library 
subscriptions: ProQuest Central</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Publicly Available Content (ProQuest)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>Applied sciences</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lee, Seokjin</au><au>Kim, Minhan</au><au>Shin, Seunghyeon</au><au>Park, Sooyoung</au><au>Jeong, Youngho</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Data-Dependent Feature Extraction Method Based on Non-Negative Matrix Factorization for Weakly Supervised Domestic Sound Event Detection</atitle><jtitle>Applied sciences</jtitle><date>2021-01-01</date><risdate>2021</risdate><volume>11</volume><issue>3</issue><spage>1040</spage><pages>1040-</pages><issn>2076-3417</issn><eissn>2076-3417</eissn><abstract>In this paper, feature extraction methods are developed based on the non-negative matrix factorization (NMF) algorithm to be applied in weakly supervised sound event detection. Recently, the development of various features and systems have been attempted to tackle the problems of acoustic scene classification and sound event detection. However, most of these systems use data-independent spectral features, e.g., Mel-spectrogram, log-Mel-spectrum, and gammatone filterbank. Some data-dependent feature extraction methods, including the NMF-based methods, recently demonstrated the potential to tackle the problems mentioned above for long-term acoustic signals. 
In this paper, we further develop the recently proposed NMF-based feature extraction method to enable its application in weakly supervised sound event detection. To achieve this goal, we develop a strategy for training the frequency basis matrix using a heterogeneous database consisting of strongly- and weakly-labeled data. Moreover, we develop a non-iterative version of the NMF-based feature extraction method so that the proposed feature extraction method can be applied as a part of the model structure similar to the modern “on-the-fly” transform method for the Mel-spectrogram. To detect the sound events, the temporal basis is calculated using the NMF method and then used as a feature for the mean-teacher-model-based classifier. The results are improved for the event-wise post-processing method. To evaluate the proposed system, simulations of the weakly supervised sound event detection were conducted using the Detection and Classification of Acoustic Scenes and Events 2020 Task 4 database. The results reveal that the proposed system has F1-score performance comparable with the Mel-spectrogram and gammatonegram and exhibits 3–5% better performance than the log-Mel-spectrum and constant-Q transform.</abstract><cop>Basel</cop><pub>MDPI AG</pub><doi>10.3390/app11031040</doi><orcidid>https://orcid.org/0000-0001-8220-192X</orcidid><orcidid>https://orcid.org/0000-0001-9552-8593</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2076-3417
ispartof	Applied sciences, 2021-01, Vol.11 (3), p.1040
issn	2076-3417 2076-3417
language	eng
recordid	cdi_doaj_primary_oai_doaj_org_article_0810b23424d442828d9ad417ba3feea9
source	Publicly Available Content (ProQuest)
subjects	Acoustics ; Artificial intelligence ; Classification ; Datasets ; Factorization ; feature extraction ; Machine learning ; Methods ; Neural networks ; Noise ; non-negative matrix factorization ; Post-production processing ; Principal components analysis ; Signal processing ; Sound ; sound event detection ; Speech
title	Data-Dependent Feature Extraction Method Based on Non-Negative Matrix Factorization for Weakly Supervised Domestic Sound Event Detection
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T03%3A14%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Data-Dependent%20Feature%20Extraction%20Method%20Based%20on%20Non-Negative%20Matrix%20Factorization%20for%20Weakly%20Supervised%20Domestic%20Sound%20Event%20Detection&rft.jtitle=Applied%20sciences&rft.au=Lee,%20Seokjin&rft.date=2021-01-01&rft.volume=11&rft.issue=3&rft.spage=1040&rft.pages=1040-&rft.issn=2076-3417&rft.eissn=2076-3417&rft_id=info:doi/10.3390/app11031040&rft_dat=%3Cproquest_doaj_%3E2524471960%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c392t-c2c15d97efc71c4ff08a9dabcc969e90449ac8eb031f3dea64751be4d9f1976a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2483359221&rft_id=info:pmid/&rfr_iscdi=true