\(\text{M}^\text{6}(\text{GPT})^\text{3}\): Generating Multitrack Modifiable Multi-Minute MIDI Music from Text using Genetic algorithms, Probabilistic methods and GPT Models in any Progression and Time signature
This work introduces the \(\text{M}^\text{6}(\text{GPT})^\text{3}\) composer system, capable of generating complete, multi-minute musical compositions with complex structures in any time signature, in the MIDI domain from input descriptions in natural language. The system utilizes an autoregressive...
Published in: | arXiv.org 2024-11 |
---|---|
Main Authors: | Poćwiardowski, Jakub ; Modrzejewski, Mateusz ; Tatara, Marek S |
Format: | Article |
Language: | English |
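Subjects: | Arousal ; Composition ; Genetic algorithms ; Markov chains ; Natural language ; Natural language processing ; Neural networks ; Normal distribution ; Parameters ; Percussion ; Probabilistic methods ; Statistical analysis |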
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Poćwiardowski, Jakub ; Modrzejewski, Mateusz ; Tatara, Marek S |
description | This work introduces the \(\text{M}^\text{6}(\text{GPT})^\text{3}\) composer system, capable of generating complete, multi-minute musical compositions with complex structures in any time signature, in the MIDI domain from input descriptions in natural language. The system utilizes an autoregressive transformer language model to map natural language prompts to composition parameters in JSON format. The defined structure includes time signature, scales, chord progressions, and valence-arousal values, from which accompaniment, melody, bass, motif, and percussion tracks are created. We propose a genetic algorithm for the generation of melodic elements. The algorithm incorporates mutations with musical significance and a fitness function based on a normal distribution and predefined musical feature values. The values adaptively evolve, influenced by emotional parameters and distinct playing styles. The system for generating percussion in any time signature utilizes probabilistic methods, including Markov chains. Through both human and objective evaluations, we demonstrate that our music generation approach outperforms baselines on specific, musically meaningful metrics, offering a viable alternative to purely neural network-based systems. |
format | article |
publisher | Ithaca: Cornell University Library, arXiv.org |
date | 2024-11-29 |
rights | 2024. This work is published under http://creativecommons.org/licenses/by-nc-sa/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
oa | free_for_read |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-11 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3107311485 |
source | Publicly Available Content Database |
subjects | Arousal ; Composition ; Genetic algorithms ; Markov chains ; Natural language ; Natural language processing ; Neural networks ; Normal distribution ; Parameters ; Percussion ; Probabilistic methods ; Statistical analysis |
title | \(\text{M}^\text{6}(\text{GPT})^\text{3}\): Generating Multitrack Modifiable Multi-Minute MIDI Music from Text using Genetic algorithms, Probabilistic methods and GPT Models in any Progression and Time signature |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-25T19%3A53%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=(%5Ctext%7BM%7D%5E%5Ctext%7B6%7D(%5Ctext%7BGPT%7D)%5E%5Ctext%7B3%7D%5C):%20Generating%20Multitrack%20Modifiable%20Multi-Minute%20MIDI%20Music%20from%20Text%20using%20Genetic%20algorithms,%20Probabilistic%20methods%20and%20GPT%20Models%20in%20any%20Progression%20and%20Time%20signature&rft.jtitle=arXiv.org&rft.au=Po%C4%87wiardowski,%20Jakub&rft.date=2024-11-29&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3107311485%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_31073114853%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3107311485&rft_id=info:pmid/&rfr_iscdi=true |
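
The description above mentions a genetic algorithm whose fitness function is based on a normal distribution and predefined musical feature values. The record gives no further detail, so the following is only a minimal sketch of that general idea, not the authors' method. The feature names, target means and standard deviations, the single-note mutation, and the truncation selection scheme are all assumptions made for illustration.

```python
import math
import random

# Hypothetical target features as (mean, standard deviation) pairs.
# These numbers are illustrative assumptions, not values from the paper.
TARGETS = {
    "mean_interval": (2.0, 1.0),  # average absolute step between notes, in semitones
    "pitch_range": (12.0, 4.0),   # highest minus lowest pitch, in semitones
}


def features(melody):
    """Compute the two illustrative features of a melody (list of MIDI pitches)."""
    steps = [abs(b - a) for a, b in zip(melody, melody[1:])]
    return {
        "mean_interval": sum(steps) / len(steps),
        "pitch_range": max(melody) - min(melody),
    }


def fitness(melody):
    """Sum of unnormalised Gaussian scores; each term is 1.0 when a feature hits its target."""
    feats = features(melody)
    return sum(
        math.exp(-0.5 * ((feats[name] - mu) / sigma) ** 2)
        for name, (mu, sigma) in TARGETS.items()
    )


def mutate(melody, rng):
    """Shift one randomly chosen note by up to two semitones (a stand-in mutation)."""
    child = list(melody)
    i = rng.randrange(len(child))
    child[i] = min(127, max(0, child[i] + rng.choice([-2, -1, 1, 2])))
    return child


def evolve(length=16, population_size=30, generations=200, seed=0):
    """Keep the fitter half each generation and refill with mutated survivors."""
    rng = random.Random(seed)
    population = [
        [rng.randint(55, 79) for _ in range(length)] for _ in range(population_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        offspring = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + offspring
    return max(population, key=fitness)
```

Calling `evolve()` returns a list of MIDI pitch numbers whose features sit near the assumed targets. In the system described in the record, the target values are said to adapt to emotional parameters and playing styles; in this sketch that would correspond to changing the entries of `TARGETS` before running the search.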