Content based audio classification and retrieval using joint time-frequency analysis

We present an audio classification and retrieval technique that exploits the non-stationary behavior of music signals and extracts features that characterize their spectral change over time. Audio classification provides a solution to incorrect and inefficient manual labelling of audio files on comp...

Bibliographic Details
Main Authors: Esmaili, S., Krishnan, S., Raahemifar, K.
Format: Conference Proceeding
Language: English
container_end_page 665
container_issue
container_start_page V
container_title
container_volume 5
creator Esmaili, S.
Krishnan, S.
Raahemifar, K.
description We present an audio classification and retrieval technique that exploits the non-stationary behavior of music signals and extracts features that characterize their spectral change over time. Audio classification addresses the incorrect and inefficient manual labelling of audio files on computers by allowing users to extract music files based on content similarity rather than labels. In our technique, classification is performed using time-frequency analysis and sounds are classified into 6 music groups consisting of rock, classical, folk, jazz and pop. For each 5-second music segment, the extracted features include entropy, centroid, centroid ratio, bandwidth, silence ratio, energy ratio, and the locations of minimum and maximum energy. Using a database of 143 signals, a set of 10 time-frequency features is extracted, and a classification accuracy of around 93% using regular linear discriminant analysis or 92.3% using the leave-one-out method is achieved.
doi_str_mv 10.1109/ICASSP.2004.1327198
format conference_proceeding
fullrecord <record><control><sourceid>pascalfrancis_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_1327198</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1327198</ieee_id><sourcerecordid>17610995</sourcerecordid><originalsourceid>FETCH-LOGICAL-i505-b899bcccb41624b8c39cc5cd5800098097f750a9f6ffe4e45856d1a028b6e5b23</originalsourceid><addsrcrecordid>eNpFkEtrwzAQhEUf0DTNL8hFlx6dSrJkSccS-oJAC8mht7CSV0XBkVPLKeTfV5BCYWEO-80wDCFzzhacM_vwtnxcrz8WgjG54LXQ3JoLMhG1thW37POSzKw2rFxtpJHiiky4EqxquLQ35DbnHWPMaGkmZLPs04hppA4ythSObeyp7yDnGKKHMfaJQmrpgOMQ8Qc6eswxfdFdH4tpjHuswoDfR0z-VEDoTjnmO3IdoMs4-9Mp2Tw_bZav1er9pVRfVVExVTljrfPeO8kbIZ3xtfVe-VaZ0s4aZnXQioENTQgoUSqjmpYDE8Y1qJyop-T-HHuA7KELAyQf8_YwxD0Mpy3XTdnKqsLNz1xExP_3ebf6F3YjYLc</addsrcrecordid><sourcetype>Index Database</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Content based audio classification and retrieval using joint time-frequency analysis</title><source>IEEE Xplore All Conference Series</source><creator>Esmaili, S. ; Krishnan, S. ; Raahemifar, K.</creator><creatorcontrib>Esmaili, S. ; Krishnan, S. ; Raahemifar, K.</creatorcontrib><description>We present an audio classification and retrieval technique that exploits the non-stationary behavior of music signals and extracts features that characterize their spectral change over time. Audio classification provides a solution to incorrect and inefficient manual labelling of audio files on computers by allowing users to extract music files based on content similarity rather than labels. In our technique, classification is performed using time-frequency analysis and sounds are classified into 6 music groups consisting of rock, classical, folk, jazz and pop. For each 5 second music segment, the features that are extracted include entropy, centroid, centroid ratio, bandwidth, silence ratio, energy ratio, and location of minimum and maximum energy. 
Using a database of 143 signals, a set of 10 time-frequency features are extracted and an accuracy of classification of around 93% using regular linear discriminant analysis or 92.3% using the leave-one-out method is achieved.</description><identifier>ISSN: 1520-6149</identifier><identifier>ISBN: 9780780384842</identifier><identifier>ISBN: 0780384849</identifier><identifier>EISSN: 2379-190X</identifier><identifier>DOI: 10.1109/ICASSP.2004.1327198</identifier><language>eng</language><publisher>Piscataway, N.J: IEEE</publisher><subject>Applied sciences ; Bandwidth ; Content based retrieval ; Entropy ; Exact sciences and technology ; Feature extraction ; Information, signal and communications theory ; Labeling ; Linear discriminant analysis ; Miscellaneous ; Multiple signal classification ; Music information retrieval ; Signal processing ; Spatial databases ; Telecommunications and information theory ; Time frequency analysis</subject><ispartof>2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004, Vol.5, p.V-665</ispartof><rights>2006 INIST-CNRS</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1327198$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,4050,4051,27925,54555,54920,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1327198$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=17610995$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Esmaili, S.</creatorcontrib><creatorcontrib>Krishnan, S.</creatorcontrib><creatorcontrib>Raahemifar, K.</creatorcontrib><title>Content based audio classification and retrieval using 
joint time-frequency analysis</title><title>2004 IEEE International Conference on Acoustics, Speech, and Signal Processing</title><addtitle>ICASSP</addtitle><description>We present an audio classification and retrieval technique that exploits the non-stationary behavior of music signals and extracts features that characterize their spectral change over time. Audio classification provides a solution to incorrect and inefficient manual labelling of audio files on computers by allowing users to extract music files based on content similarity rather than labels. In our technique, classification is performed using time-frequency analysis and sounds are classified into 6 music groups consisting of rock, classical, folk, jazz and pop. For each 5 second music segment, the features that are extracted include entropy, centroid, centroid ratio, bandwidth, silence ratio, energy ratio, and location of minimum and maximum energy. Using a database of 143 signals, a set of 10 time-frequency features are extracted and an accuracy of classification of around 93% using regular linear discriminant analysis or 92.3% using the leave-one-out method is achieved.</description><subject>Applied sciences</subject><subject>Bandwidth</subject><subject>Content based retrieval</subject><subject>Entropy</subject><subject>Exact sciences and technology</subject><subject>Feature extraction</subject><subject>Information, signal and communications theory</subject><subject>Labeling</subject><subject>Linear discriminant analysis</subject><subject>Miscellaneous</subject><subject>Multiple signal classification</subject><subject>Music information retrieval</subject><subject>Signal processing</subject><subject>Spatial databases</subject><subject>Telecommunications and information theory</subject><subject>Time frequency 
analysis</subject><issn>1520-6149</issn><issn>2379-190X</issn><isbn>9780780384842</isbn><isbn>0780384849</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2004</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNpFkEtrwzAQhEUf0DTNL8hFlx6dSrJkSccS-oJAC8mht7CSV0XBkVPLKeTfV5BCYWEO-80wDCFzzhacM_vwtnxcrz8WgjG54LXQ3JoLMhG1thW37POSzKw2rFxtpJHiiky4EqxquLQ35DbnHWPMaGkmZLPs04hppA4ythSObeyp7yDnGKKHMfaJQmrpgOMQ8Qc6eswxfdFdH4tpjHuswoDfR0z-VEDoTjnmO3IdoMs4-9Mp2Tw_bZav1er9pVRfVVExVTljrfPeO8kbIZ3xtfVe-VaZ0s4aZnXQioENTQgoUSqjmpYDE8Y1qJyop-T-HHuA7KELAyQf8_YwxD0Mpy3XTdnKqsLNz1xExP_3ebf6F3YjYLc</recordid><startdate>2004</startdate><enddate>2004</enddate><creator>Esmaili, S.</creator><creator>Krishnan, S.</creator><creator>Raahemifar, K.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope><scope>IQODW</scope></search><sort><creationdate>2004</creationdate><title>Content based audio classification and retrieval using joint time-frequency analysis</title><author>Esmaili, S. ; Krishnan, S. 
; Raahemifar, K.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i505-b899bcccb41624b8c39cc5cd5800098097f750a9f6ffe4e45856d1a028b6e5b23</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Applied sciences</topic><topic>Bandwidth</topic><topic>Content based retrieval</topic><topic>Entropy</topic><topic>Exact sciences and technology</topic><topic>Feature extraction</topic><topic>Information, signal and communications theory</topic><topic>Labeling</topic><topic>Linear discriminant analysis</topic><topic>Miscellaneous</topic><topic>Multiple signal classification</topic><topic>Music information retrieval</topic><topic>Signal processing</topic><topic>Spatial databases</topic><topic>Telecommunications and information theory</topic><topic>Time frequency analysis</topic><toplevel>online_resources</toplevel><creatorcontrib>Esmaili, S.</creatorcontrib><creatorcontrib>Krishnan, S.</creatorcontrib><creatorcontrib>Raahemifar, K.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection><collection>Pascal-Francis</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Esmaili, S.</au><au>Krishnan, S.</au><au>Raahemifar, K.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Content based audio classification and retrieval using joint time-frequency analysis</atitle><btitle>2004 IEEE International Conference on Acoustics, Speech, and Signal 
Processing</btitle><stitle>ICASSP</stitle><date>2004</date><risdate>2004</risdate><volume>5</volume><spage>V</spage><epage>665</epage><pages>V-665</pages><issn>1520-6149</issn><eissn>2379-190X</eissn><isbn>9780780384842</isbn><isbn>0780384849</isbn><abstract>We present an audio classification and retrieval technique that exploits the non-stationary behavior of music signals and extracts features that characterize their spectral change over time. Audio classification provides a solution to incorrect and inefficient manual labelling of audio files on computers by allowing users to extract music files based on content similarity rather than labels. In our technique, classification is performed using time-frequency analysis and sounds are classified into 6 music groups consisting of rock, classical, folk, jazz and pop. For each 5 second music segment, the features that are extracted include entropy, centroid, centroid ratio, bandwidth, silence ratio, energy ratio, and location of minimum and maximum energy. Using a database of 143 signals, a set of 10 time-frequency features are extracted and an accuracy of classification of around 93% using regular linear discriminant analysis or 92.3% using the leave-one-out method is achieved.</abstract><cop>Piscataway, N.J</cop><pub>IEEE</pub><doi>10.1109/ICASSP.2004.1327198</doi></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1520-6149
ispartof 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004, Vol.5, p.V-665
issn 1520-6149
2379-190X
language eng
recordid cdi_ieee_primary_1327198
source IEEE Xplore All Conference Series
subjects Applied sciences
Bandwidth
Content based retrieval
Entropy
Exact sciences and technology
Feature extraction
Information, signal and communications theory
Labeling
Linear discriminant analysis
Miscellaneous
Multiple signal classification
Music information retrieval
Signal processing
Spatial databases
Telecommunications and information theory
Time frequency analysis
title Content based audio classification and retrieval using joint time-frequency analysis
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T22%3A47%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pascalfrancis_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Content%20based%20audio%20classification%20and%20retrieval%20using%20joint%20time-frequency%20analysis&rft.btitle=2004%20IEEE%20International%20Conference%20on%20Acoustics,%20Speech,%20and%20Signal%20Processing&rft.au=Esmaili,%20S.&rft.date=2004&rft.volume=5&rft.spage=V&rft.epage=665&rft.pages=V-665&rft.issn=1520-6149&rft.eissn=2379-190X&rft.isbn=9780780384842&rft.isbn_list=0780384849&rft_id=info:doi/10.1109/ICASSP.2004.1327198&rft_dat=%3Cpascalfrancis_CHZPO%3E17610995%3C/pascalfrancis_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i505-b899bcccb41624b8c39cc5cd5800098097f750a9f6ffe4e45856d1a028b6e5b23%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=1327198&rfr_iscdi=true
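
The description above lists entropy, centroid, bandwidth, and the locations of minimum and maximum energy among the extracted features. The record does not give the paper's exact definitions, so the following is only a minimal sketch using common textbook definitions applied to a single power spectrum. The function names, the sample frequency grid, and the choice of base-2 entropy are assumptions made for illustration, not the authors' implementation.

```python
import math


def normalize(power):
    """Scale non-negative power values so they sum to 1."""
    total = sum(power)
    if total <= 0:
        raise ValueError("power must have a positive sum")
    return [p / total for p in power]


def spectral_entropy(power):
    """Shannon entropy (in bits) of the normalized power distribution."""
    probs = normalize(power)
    return -sum(p * math.log2(p) for p in probs if p > 0)


def spectral_centroid(freqs, power):
    """Power-weighted mean frequency."""
    probs = normalize(power)
    return sum(f * p for f, p in zip(freqs, probs))


def spectral_bandwidth(freqs, power):
    """Power-weighted standard deviation of frequency around the centroid."""
    probs = normalize(power)
    centroid = spectral_centroid(freqs, power)
    variance = sum(p * (f - centroid) ** 2 for f, p in zip(freqs, probs))
    return math.sqrt(variance)


def energy_extrema(freqs, power):
    """Frequencies at which the power is smallest and largest."""
    i_min = min(range(len(power)), key=power.__getitem__)
    i_max = max(range(len(power)), key=power.__getitem__)
    return freqs[i_min], freqs[i_max]


# Hypothetical four-bin spectrum, used only to exercise the functions.
freqs = [100.0, 200.0, 300.0, 400.0]
flat_power = [1.0, 1.0, 1.0, 1.0]
peaked_power = [1.0, 4.0, 2.0, 0.5]

print(spectral_entropy(flat_power))
print(spectral_centroid(freqs, flat_power))
print(spectral_bandwidth(freqs, flat_power))
print(energy_extrema(freqs, peaked_power))
```

In a time-frequency setting, functions like these would presumably be evaluated per time slice of each 5-second segment and then summarized into a feature vector for the classifier. That step is not shown here because the record does not specify it.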