
Predictability of Music Descriptor Time Series and its Application to Cover Song Detection


Bibliographic Details
Published in: IEEE transactions on audio, speech, and language processing, 2012-02, Vol.20 (2), p.514-525
Main Authors: Serra, J., Kantz, H., Serra, X., Andrzejak, R. G.
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c380t-132a78d60406a944ef16ad6710672ecfe09c62f9eb4e86c0e538785f862395633
cites cdi_FETCH-LOGICAL-c380t-132a78d60406a944ef16ad6710672ecfe09c62f9eb4e86c0e538785f862395633
container_end_page 525
container_issue 2
container_start_page 514
container_title IEEE transactions on audio, speech, and language processing
container_volume 20
creator Serra, J.
Kantz, H.
Serra, X.
Andrzejak, R. G.
description Intuitively, music has both predictable and unpredictable components. In this paper, we assess this qualitative statement in a quantitative way using common time series models fitted to state-of-the-art music descriptors. These descriptors cover different musical facets and are extracted from a large collection of real audio recordings comprising a variety of musical genres. Our findings show that music descriptor time series exhibit a certain predictability not only for short time intervals, but also for mid-term and relatively long intervals. This fact is observed independently of the descriptor, musical facet and time series model we consider. Moreover, we show that our findings are not only of theoretical relevance but can also have practical impact. To this end we demonstrate that music predictability at relatively long time intervals can be exploited in a real-world application, namely the automatic identification of cover songs (i.e., different renditions or versions of the same musical piece). Importantly, this prediction strategy yields a parameter-free approach for cover song identification that is substantially faster, allows for reduced computational storage and still maintains highly competitive accuracies when compared to state-of-the-art systems.
doi_str_mv 10.1109/TASL.2011.2162321
format article
fullrecord <record><control><sourceid>csuc_pasca</sourceid><recordid>TN_cdi_pascalfrancis_primary_25473684</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5955095</ieee_id><sourcerecordid>oai_recercat_cat_2072_213269</sourcerecordid><originalsourceid>FETCH-LOGICAL-c380t-132a78d60406a944ef16ad6710672ecfe09c62f9eb4e86c0e538785f862395633</originalsourceid><addsrcrecordid>eNpNkEtLAzEQgBdRsFZ_gHjJxePWvDc5lvqEikLrxUtIs7MSaTdLkgr99-7SUj2ESSbzZSZfUVwTPCEE67vldDGfUEzIhBJJGSUnxYgIocpKU3563BN5Xlyk9I0xZ5KTUfH5HqH2LtuVX_u8Q6FBr9vkHbqH5KLvcoho6TeAFhA9JGTbGvmc0LTr1t7Z7EOLckCz8AMRLUL71YMZ3JC_LM4au05wdYjj4uPxYTl7LudvTy-z6bx0TOFcEkZtpWqJOZZWcw4NkbaWFcGyouAawNpJ2mhYcVDSYRBMVUo0qv-mFpKxcUH277q0dSaCg9gPZoL1f4dhUVxRQ_t2Uv9jYkgpQmO66Dc27gzBZhBqBqFmEGoOQnvmds90Njm7bqJtnU9HkApeMal4X3ezr_MAcLwWWgisBfsFroR99Q</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Predictability of Music Descriptor Time Series and its Application to Cover Song Detection</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Serra, J. ; Kantz, H. ; Serra, X. ; Andrzejak, R. G.</creator><creatorcontrib>Serra, J. ; Kantz, H. ; Serra, X. ; Andrzejak, R. G.</creatorcontrib><description>Intuitively, music has both predictable and unpredictable components. In this paper, we assess this qualitative statement in a quantitative way using common time series models fitted to state-of-the-art music descriptors. These descriptors cover different musical facets and are extracted from a large collection of real audio recordings comprising a variety of musical genres. Our findings show that music descriptor time series exhibit a certain predictability not only for short time intervals, but also for mid-term and relatively long intervals. This fact is observed independently of the descriptor, musical facet and time series model we consider. 
Moreover, we show that our findings are not only of theoretical relevance but can also have practical impact. To this end we demonstrate that music predictability at relatively long time intervals can be exploited in a real-world application, namely the automatic identification of cover songs (i.e., different renditions or versions of the same musical piece). Importantly, this prediction strategy yields a parameter-free approach for cover song identification that is substantially faster, allows for reduced computational storage and still maintains highly competitive accuracies when compared to state-of-the-art systems.</description><identifier>ISSN: 1558-7916</identifier><identifier>EISSN: 1558-7924</identifier><identifier>DOI: 10.1109/TASL.2011.2162321</identifier><identifier>CODEN: ITASD8</identifier><language>eng</language><publisher>Piscataway, NJ: IEEE</publisher><subject>Acoustic signal analysis ; Anàlisi ; Applied sciences ; Computational modeling ; Exact sciences and technology ; information retrieval ; Information theory ; Information, signal and communications theory ; Informàtica ; Materials ; music ; Música ; prediction methods ; Predictive models ; Signal and communications theory ; Signal representation. Spectral analysis ; Signal, noise ; Sèries temporals ; Telecommunications and information theory ; Timbre ; time series ; Time series analysis ; Tractament per ordinador</subject><ispartof>IEEE transactions on audio, speech, and language processing, 2012-02, Vol.20 (2), p.514-525</ispartof><rights>2015 INIST-CNRS</rights><rights>info:eu-repo/semantics/openAccess © 2011 IEEE. Personal use of this material is permitted. 
Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works./nThe final published article can be found at &lt;a href="http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5955095&amp;tag=1"&gt;http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5955095&amp;tag=1&lt;/a&gt;</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c380t-132a78d60406a944ef16ad6710672ecfe09c62f9eb4e86c0e538785f862395633</citedby><cites>FETCH-LOGICAL-c380t-132a78d60406a944ef16ad6710672ecfe09c62f9eb4e86c0e538785f862395633</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5955095$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>230,314,776,780,881,27901,27902,54771</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=25473684$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Serra, J.</creatorcontrib><creatorcontrib>Kantz, H.</creatorcontrib><creatorcontrib>Serra, X.</creatorcontrib><creatorcontrib>Andrzejak, R. G.</creatorcontrib><title>Predictability of Music Descriptor Time Series and its Application to Cover Song Detection</title><title>IEEE transactions on audio, speech, and language processing</title><addtitle>TASL</addtitle><description>Intuitively, music has both predictable and unpredictable components. In this paper, we assess this qualitative statement in a quantitative way using common time series models fitted to state-of-the-art music descriptors. 
These descriptors cover different musical facets and are extracted from a large collection of real audio recordings comprising a variety of musical genres. Our findings show that music descriptor time series exhibit a certain predictability not only for short time intervals, but also for mid-term and relatively long intervals. This fact is observed independently of the descriptor, musical facet and time series model we consider. Moreover, we show that our findings are not only of theoretical relevance but can also have practical impact. To this end we demonstrate that music predictability at relatively long time intervals can be exploited in a real-world application, namely the automatic identification of cover songs (i.e., different renditions or versions of the same musical piece). Importantly, this prediction strategy yields a parameter-free approach for cover song identification that is substantially faster, allows for reduced computational storage and still maintains highly competitive accuracies when compared to state-of-the-art systems.</description><subject>Acoustic signal analysis</subject><subject>Anàlisi</subject><subject>Applied sciences</subject><subject>Computational modeling</subject><subject>Exact sciences and technology</subject><subject>information retrieval</subject><subject>Information theory</subject><subject>Information, signal and communications theory</subject><subject>Informàtica</subject><subject>Materials</subject><subject>music</subject><subject>Música</subject><subject>prediction methods</subject><subject>Predictive models</subject><subject>Signal and communications theory</subject><subject>Signal representation. 
Spectral analysis</subject><subject>Signal, noise</subject><subject>Sèries temporals</subject><subject>Telecommunications and information theory</subject><subject>Timbre</subject><subject>time series</subject><subject>Time series analysis</subject><subject>Tractament per ordinador</subject><issn>1558-7916</issn><issn>1558-7924</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><recordid>eNpNkEtLAzEQgBdRsFZ_gHjJxePWvDc5lvqEikLrxUtIs7MSaTdLkgr99-7SUj2ESSbzZSZfUVwTPCEE67vldDGfUEzIhBJJGSUnxYgIocpKU3563BN5Xlyk9I0xZ5KTUfH5HqH2LtuVX_u8Q6FBr9vkHbqH5KLvcoho6TeAFhA9JGTbGvmc0LTr1t7Z7EOLckCz8AMRLUL71YMZ3JC_LM4au05wdYjj4uPxYTl7LudvTy-z6bx0TOFcEkZtpWqJOZZWcw4NkbaWFcGyouAawNpJ2mhYcVDSYRBMVUo0qv-mFpKxcUH277q0dSaCg9gPZoL1f4dhUVxRQ_t2Uv9jYkgpQmO66Dc27gzBZhBqBqFmEGoOQnvmds90Njm7bqJtnU9HkApeMal4X3ezr_MAcLwWWgisBfsFroR99Q</recordid><startdate>20120201</startdate><enddate>20120201</enddate><creator>Serra, J.</creator><creator>Kantz, H.</creator><creator>Serra, X.</creator><creator>Andrzejak, R. G.</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><general>Institute of Electrical and Electronics Engineers (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>XX2</scope></search><sort><creationdate>20120201</creationdate><title>Predictability of Music Descriptor Time Series and its Application to Cover Song Detection</title><author>Serra, J. ; Kantz, H. ; Serra, X. ; Andrzejak, R. 
G.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c380t-132a78d60406a944ef16ad6710672ecfe09c62f9eb4e86c0e538785f862395633</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Acoustic signal analysis</topic><topic>Anàlisi</topic><topic>Applied sciences</topic><topic>Computational modeling</topic><topic>Exact sciences and technology</topic><topic>information retrieval</topic><topic>Information theory</topic><topic>Information, signal and communications theory</topic><topic>Informàtica</topic><topic>Materials</topic><topic>music</topic><topic>Música</topic><topic>prediction methods</topic><topic>Predictive models</topic><topic>Signal and communications theory</topic><topic>Signal representation. Spectral analysis</topic><topic>Signal, noise</topic><topic>Sèries temporals</topic><topic>Telecommunications and information theory</topic><topic>Timbre</topic><topic>time series</topic><topic>Time series analysis</topic><topic>Tractament per ordinador</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Serra, J.</creatorcontrib><creatorcontrib>Kantz, H.</creatorcontrib><creatorcontrib>Serra, X.</creatorcontrib><creatorcontrib>Andrzejak, R. G.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998–Present</collection><collection>IEL</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Recercat</collection><jtitle>IEEE transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Serra, J.</au><au>Kantz, H.</au><au>Serra, X.</au><au>Andrzejak, R. 
G.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Predictability of Music Descriptor Time Series and its Application to Cover Song Detection</atitle><jtitle>IEEE transactions on audio, speech, and language processing</jtitle><stitle>TASL</stitle><date>2012-02-01</date><risdate>2012</risdate><volume>20</volume><issue>2</issue><spage>514</spage><epage>525</epage><pages>514-525</pages><issn>1558-7916</issn><eissn>1558-7924</eissn><coden>ITASD8</coden><abstract>Intuitively, music has both predictable and unpredictable components. In this paper, we assess this qualitative statement in a quantitative way using common time series models fitted to state-of-the-art music descriptors. These descriptors cover different musical facets and are extracted from a large collection of real audio recordings comprising a variety of musical genres. Our findings show that music descriptor time series exhibit a certain predictability not only for short time intervals, but also for mid-term and relatively long intervals. This fact is observed independently of the descriptor, musical facet and time series model we consider. Moreover, we show that our findings are not only of theoretical relevance but can also have practical impact. To this end we demonstrate that music predictability at relatively long time intervals can be exploited in a real-world application, namely the automatic identification of cover songs (i.e., different renditions or versions of the same musical piece). Importantly, this prediction strategy yields a parameter-free approach for cover song identification that is substantially faster, allows for reduced computational storage and still maintains highly competitive accuracies when compared to state-of-the-art systems.</abstract><cop>Piscataway, NJ</cop><pub>IEEE</pub><doi>10.1109/TASL.2011.2162321</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1558-7916
ispartof IEEE transactions on audio, speech, and language processing, 2012-02, Vol.20 (2), p.514-525
issn 1558-7916
1558-7924
language eng
recordid cdi_pascalfrancis_primary_25473684
source IEEE Electronic Library (IEL) Journals
subjects Acoustic signal analysis
Anàlisi
Applied sciences
Computational modeling
Exact sciences and technology
information retrieval
Information theory
Information, signal and communications theory
Informàtica
Materials
music
Música
prediction methods
Predictive models
Signal and communications theory
Signal representation. Spectral analysis
Signal, noise
Sèries temporals
Telecommunications and information theory
Timbre
time series
Time series analysis
Tractament per ordinador
title Predictability of Music Descriptor Time Series and its Application to Cover Song Detection
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-23T13%3A40%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-csuc_pasca&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predictability%20of%20Music%20Descriptor%20Time%20Series%20and%20its%20Application%20to%20Cover%20Song%20Detection&rft.jtitle=IEEE%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Serra,%20J.&rft.date=2012-02-01&rft.volume=20&rft.issue=2&rft.spage=514&rft.epage=525&rft.pages=514-525&rft.issn=1558-7916&rft.eissn=1558-7924&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASL.2011.2162321&rft_dat=%3Ccsuc_pasca%3Eoai_recercat_cat_2072_213269%3C/csuc_pasca%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c380t-132a78d60406a944ef16ad6710672ecfe09c62f9eb4e86c0e538785f862395633%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5955095&rfr_iscdi=true