Training environmental sound classification models for real-world deployment in edge devices
Published in: | Discover applied sciences 2024-03, Vol.6 (4), p.166, Article 166 |
---|---|
Main Authors: | Goulão, Manuel; Bandeira, Lourenço; Martins, Bruno; L. Oliveira, Arlindo |
Format: | Article |
Language: | English |
Subjects: | |
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c314t-fda4e819bcf46fa70d704ee4eb2fbd6fbb0b34dd512c5c3190c22153022e94d73 |
container_end_page | |
container_issue | 4 |
container_start_page | 166 |
container_title | Discover applied sciences |
container_volume | 6 |
creator | Goulão, Manuel; Bandeira, Lourenço; Martins, Bruno; L. Oliveira, Arlindo
description | The interest in smart city technologies has grown in recent years, and a major challenge is to develop methods that can extract useful information from data collected by sensors in the city. One possible scenario is the use of sound sensors to detect passing vehicles, sirens, and other sounds on the streets. However, classifying sounds in a street environment is a complex task due to various factors that can affect sound quality, such as weather, traffic volume, and microphone quality. This paper presents a deep learning model for multi-label sound classification that can be deployed in the real world on edge devices. We describe two key components, namely data collection and preparation, and the methodology used to train the model, including a pre-training step based on knowledge distillation. We benchmark our models on the ESC-50 dataset and show an accuracy of 85.4%, comparable to similar state-of-the-art models that require significantly more computational resources. We also evaluated the model using data collected in the real world by early prototypes of luminaires integrating edge devices, with results showing that the approach works well for most vehicles but has significant limitations for the classes “person” and “bicycle”. Given the difference between the benchmarking and the real-world results, we claim that the quality and quantity of public and private data available for this type of task are the main limitation. Finally, all results show clear benefits from pre-training the model using knowledge distillation. |
doi_str_mv | 10.1007/s42452-024-05803-7 |
format | article |
fullrecord | Springer Nature B.V (London); published 2024-03-26; peer reviewed; open access under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/) |
fulltext | fulltext |
identifier | ISSN: 3004-9261 |
ispartof | Discover applied sciences, 2024-03, Vol.6 (4), p.166, Article 166 |
issn | 3004-9261; 2523-3963 (ISSN); 3004-9261; 2523-3971 (EISSN)
language | eng |
recordid | cdi_proquest_journals_2986739976 |
source | Publicly Available Content Database; Springer Nature - SpringerLink Journals - Fully Open Access |
subjects | Accuracy; Background noise; Bicycles; Classification; Data collection; Datasets; Deep learning; Distillation; Energy consumption; Environment models; Information processing; Knowledge; Machine learning; Neural networks; Ontology; Sensors; Sirens; Traffic volume
title | Training environmental sound classification models for real-world deployment in edge devices |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T01%3A27%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Training%20environmental%20sound%20classification%20models%20for%20real-world%20deployment%20in%20edge%20devices&rft.jtitle=Discover%20applied%20sciences&rft.au=Goul%C3%A3o,%20Manuel&rft.date=2024-03-26&rft.volume=6&rft.issue=4&rft.spage=166&rft.pages=166-&rft.artnum=166&rft.issn=3004-9261&rft.eissn=3004-9261&rft_id=info:doi/10.1007/s42452-024-05803-7&rft_dat=%3Cproquest_cross%3E2986739976%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c314t-fda4e819bcf46fa70d704ee4eb2fbd6fbb0b34dd512c5c3190c22153022e94d73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2986739976&rft_id=info:pmid/&rfr_iscdi=true |
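The abstract credits a knowledge-distillation pre-training step for much of the model's benchmark result, but this record contains no code. The following is only a minimal, self-contained sketch of the standard distillation loss (temperature-softened softmax plus KL divergence, after Hinton et al.), not the authors' implementation; the function names and the temperature value are illustrative assumptions:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    A temperature > 1 flattens both distributions so the student also
    learns from the teacher's relative ranking of the wrong classes;
    the T^2 factor keeps gradient magnitudes comparable across T.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(p_teacher, p_student))
    return temperature ** 2 * kl

# A student that copies the teacher exactly incurs zero loss.
teacher = [2.0, 0.5, -1.0]
print(distillation_loss(teacher, teacher))  # → 0.0
```

In practice this term is usually mixed with the ordinary cross-entropy loss on ground-truth labels, weighted by a tunable coefficient; the abstract does not specify the weighting the authors used.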