Content based audio classification and retrieval using joint time-frequency analysis
We present an audio classification and retrieval technique that exploits the non-stationary behavior of music signals and extracts features that characterize their spectral change over time. Audio classification provides a solution to incorrect and inefficient manual labelling of audio files on comp...
Main Authors: | Esmaili, S. ; Krishnan, S. ; Raahemifar, K. |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 665 |
container_issue | |
container_start_page | V |
container_title | |
container_volume | 5 |
creator | Esmaili, S. ; Krishnan, S. ; Raahemifar, K. |
description | We present an audio classification and retrieval technique that exploits the non-stationary behavior of music signals and extracts features that characterize their spectral change over time. Audio classification provides a solution to incorrect and inefficient manual labelling of audio files on computers by allowing users to extract music files based on content similarity rather than labels. In our technique, classification is performed using time-frequency analysis and sounds are classified into 6 music groups consisting of rock, classical, folk, jazz and pop. For each 5 second music segment, the features that are extracted include entropy, centroid, centroid ratio, bandwidth, silence ratio, energy ratio, and location of minimum and maximum energy. Using a database of 143 signals, a set of 10 time-frequency features are extracted and an accuracy of classification of around 93% using regular linear discriminant analysis or 92.3% using the leave-one-out method is achieved. |
doi_str_mv | 10.1109/ICASSP.2004.1327198 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>pascalfrancis_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_1327198</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1327198</ieee_id><sourcerecordid>17610995</sourcerecordid><originalsourceid>FETCH-LOGICAL-i505-b899bcccb41624b8c39cc5cd5800098097f750a9f6ffe4e45856d1a028b6e5b23</originalsourceid><addsrcrecordid>eNpFkEtrwzAQhEUf0DTNL8hFlx6dSrJkSccS-oJAC8mht7CSV0XBkVPLKeTfV5BCYWEO-80wDCFzzhacM_vwtnxcrz8WgjG54LXQ3JoLMhG1thW37POSzKw2rFxtpJHiiky4EqxquLQ35DbnHWPMaGkmZLPs04hppA4ythSObeyp7yDnGKKHMfaJQmrpgOMQ8Qc6eswxfdFdH4tpjHuswoDfR0z-VEDoTjnmO3IdoMs4-9Mp2Tw_bZav1er9pVRfVVExVTljrfPeO8kbIZ3xtfVe-VaZ0s4aZnXQioENTQgoUSqjmpYDE8Y1qJyop-T-HHuA7KELAyQf8_YwxD0Mpy3XTdnKqsLNz1xExP_3ebf6F3YjYLc</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Content based audio classification and retrieval using joint time-frequency analysis</title><source>IEEE Xplore All Conference Series</source><creator>Esmaili, S. ; Krishnan, S. ; Raahemifar, K.</creator><creatorcontrib>Esmaili, S. ; Krishnan, S. ; Raahemifar, K.</creatorcontrib><description>We present an audio classification and retrieval technique that exploits the non-stationary behavior of music signals and extracts features that characterize their spectral change over time. Audio classification provides a solution to incorrect and inefficient manual labelling of audio files on computers by allowing users to extract music files based on content similarity rather than labels. In our technique, classification is performed using time-frequency analysis and sounds are classified into 6 music groups consisting of rock, classical, folk, jazz and pop. For each 5 second music segment, the features that are extracted include entropy, centroid, centroid ratio, bandwidth, silence ratio, energy ratio, and location of minimum and maximum energy. 
Using a database of 143 signals, a set of 10 time-frequency features are extracted and an accuracy of classification of around 93% using regular linear discriminant analysis or 92.3% using the leave-one-out method is achieved.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 9780780384842</identifier><identifier>ISBN: 0780384849</identifier><identifier>EISSN: 2379-190X</identifier><identifier>DOI: 10.1109/ICASSP.2004.1327198</identifier><language>eng</language><publisher>Piscataway, N.J: IEEE</publisher><subject>Applied sciences ; Bandwidth ; Content based retrieval ; Entropy ; Exact sciences and technology ; Feature extraction ; Information, signal and communications theory ; Labeling ; Linear discriminant analysis ; Miscellaneous ; Multiple signal classification ; Music information retrieval ; Signal processing ; Spatial databases ; Telecommunications and information theory ; Time frequency analysis</subject><ispartof>2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004, Vol.5, p.V-665</ispartof><rights>2006 INIST-CNRS</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1327198$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54555,54920,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1327198$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=17610995$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Esmaili, S.</creatorcontrib><creatorcontrib>Krishnan, S.</creatorcontrib><creatorcontrib>Raahemifar, K.</creatorcontrib><title>Content based audio classification and retrieval using 
joint time-frequency analysis</title><title>2004 IEEE International Conference on Acoustics, Speech, and Signal Processing</title><addtitle>ICASSP</addtitle><description>We present an audio classification and retrieval technique that exploits the non-stationary behavior of music signals and extracts features that characterize their spectral change over time. Audio classification provides a solution to incorrect and inefficient manual labelling of audio files on computers by allowing users to extract music files based on content similarity rather than labels. In our technique, classification is performed using time-frequency analysis and sounds are classified into 6 music groups consisting of rock, classical, folk, jazz and pop. For each 5 second music segment, the features that are extracted include entropy, centroid, centroid ratio, bandwidth, silence ratio, energy ratio, and location of minimum and maximum energy. Using a database of 143 signals, a set of 10 time-frequency features are extracted and an accuracy of classification of around 93% using regular linear discriminant analysis or 92.3% using the leave-one-out method is achieved.</description><subject>Applied sciences</subject><subject>Bandwidth</subject><subject>Content based retrieval</subject><subject>Entropy</subject><subject>Exact sciences and technology</subject><subject>Feature extraction</subject><subject>Information, signal and communications theory</subject><subject>Labeling</subject><subject>Linear discriminant analysis</subject><subject>Miscellaneous</subject><subject>Multiple signal classification</subject><subject>Music information retrieval</subject><subject>Signal processing</subject><subject>Spatial databases</subject><subject>Telecommunications and information theory</subject><subject>Time frequency 
analysis</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>9780780384842</isbn><isbn>0780384849</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2004</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNpFkEtrwzAQhEUf0DTNL8hFlx6dSrJkSccS-oJAC8mht7CSV0XBkVPLKeTfV5BCYWEO-80wDCFzzhacM_vwtnxcrz8WgjG54LXQ3JoLMhG1thW37POSzKw2rFxtpJHiiky4EqxquLQ35DbnHWPMaGkmZLPs04hppA4ythSObeyp7yDnGKKHMfaJQmrpgOMQ8Qc6eswxfdFdH4tpjHuswoDfR0z-VEDoTjnmO3IdoMs4-9Mp2Tw_bZav1er9pVRfVVExVTljrfPeO8kbIZ3xtfVe-VaZ0s4aZnXQioENTQgoUSqjmpYDE8Y1qJyop-T-HHuA7KELAyQf8_YwxD0Mpy3XTdnKqsLNz1xExP_3ebf6F3YjYLc</recordid><startdate>2004</startdate><enddate>2004</enddate><creator>Esmaili, S.</creator><creator>Krishnan, S.</creator><creator>Raahemifar, K.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope><scope>IQODW</scope></search><sort><creationdate>2004</creationdate><title>Content based audio classification and retrieval using joint time-frequency analysis</title><author>Esmaili, S. ; Krishnan, S. 
; Raahemifar, K.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i505-b899bcccb41624b8c39cc5cd5800098097f750a9f6ffe4e45856d1a028b6e5b23</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Applied sciences</topic><topic>Bandwidth</topic><topic>Content based retrieval</topic><topic>Entropy</topic><topic>Exact sciences and technology</topic><topic>Feature extraction</topic><topic>Information, signal and communications theory</topic><topic>Labeling</topic><topic>Linear discriminant analysis</topic><topic>Miscellaneous</topic><topic>Multiple signal classification</topic><topic>Music information retrieval</topic><topic>Signal processing</topic><topic>Spatial databases</topic><topic>Telecommunications and information theory</topic><topic>Time frequency analysis</topic><toplevel>online_resources</toplevel><creatorcontrib>Esmaili, S.</creatorcontrib><creatorcontrib>Krishnan, S.</creatorcontrib><creatorcontrib>Raahemifar, K.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Esmaili, S.</au><au>Krishnan, S.</au><au>Raahemifar, K.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Content based audio classification and retrieval using joint time-frequency analysis</atitle><btitle>2004 IEEE International Conference on Acoustics, Speech, and Signal 
Processing</btitle><stitle>ICASSP</stitle><date>2004</date><risdate>2004</risdate><volume>5</volume><spage>V</spage><epage>665</epage><pages>V-665</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>9780780384842</isbn><isbn>0780384849</isbn><abstract>We present an audio classification and retrieval technique that exploits the non-stationary behavior of music signals and extracts features that characterize their spectral change over time. Audio classification provides a solution to incorrect and inefficient manual labelling of audio files on computers by allowing users to extract music files based on content similarity rather than labels. In our technique, classification is performed using time-frequency analysis and sounds are classified into 6 music groups consisting of rock, classical, folk, jazz and pop. For each 5 second music segment, the features that are extracted include entropy, centroid, centroid ratio, bandwidth, silence ratio, energy ratio, and location of minimum and maximum energy. Using a database of 143 signals, a set of 10 time-frequency features are extracted and an accuracy of classification of around 93% using regular linear discriminant analysis or 92.3% using the leave-one-out method is achieved.</abstract><cop>Piscataway, N.J</cop><pub>IEEE</pub><doi>10.1109/ICASSP.2004.1327198</doi></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1520-6149 |
ispartof | 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004, Vol.5, p.V-665 |
issn | 1520-6149 ; 2379-190X |
language | eng |
recordid | cdi_ieee_primary_1327198 |
source | IEEE Xplore All Conference Series |
subjects | Applied sciences ; Bandwidth ; Content based retrieval ; Entropy ; Exact sciences and technology ; Feature extraction ; Information, signal and communications theory ; Labeling ; Linear discriminant analysis ; Miscellaneous ; Multiple signal classification ; Music information retrieval ; Signal processing ; Spatial databases ; Telecommunications and information theory ; Time frequency analysis |
title | Content based audio classification and retrieval using joint time-frequency analysis |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T22%3A47%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Content%20based%20audio%20classification%20and%20retrieval%20using%20joint%20time-frequency%20analysis&rft.btitle=2004%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech,%20and%20Signal%20Processing&rft.au=Esmaili,%20S.&rft.date=2004&rft.volume=5&rft.spage=V&rft.epage=665&rft.pages=V-665&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=9780780384842&rft.isbn_list=0780384849&rft_id=info:doi/10.1109/ICASSP.2004.1327198&rft_dat=%3Cpascalfrancis_CHZPO%3E17610995%3C/pascalfrancis_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i505-b899bcccb41624b8c39cc5cd5800098097f750a9f6ffe4e45856d1a028b6e5b23%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1327198&rfr_iscdi=true |