Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation
This article addresses the challenge of efficiently managing datasets of various sizes through two key strategies: 1) dataset compression and 2) synthetic augmentation. This article introduces a novel framework, referred to as subsample, generate, and stack (SGS), which can be used to implement both...
Published in: | IEEE transactions on systems, man, and cybernetics. Systems, 2024-11, Vol.54 (11), p.7129-7142 |
---|---|
Main Author: | Csapo, Adam B. |
Format: | Article |
Language: | English |
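Subjects: | Data compression; Data models; Dataset augmentation; dataset compression; Interpolation; Mathematical models; spiral discovery method (SDM); spiral optimization; Spirals; Training; Training data |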
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c148t-18be47a94f1ffa16a096fb2542268eecc74c6002cdaf3b32001cd9a16556cd1a3 |
container_end_page | 7142 |
container_issue | 11 |
container_start_page | 7129 |
container_title | IEEE transactions on systems, man, and cybernetics. Systems |
container_volume | 54 |
creator | Csapo, Adam B. |
description | This article addresses the challenge of efficiently managing datasets of various sizes through two key strategies: 1) dataset compression and 2) synthetic augmentation. This article introduces a novel framework, referred to as subsample, generate, and stack (SGS), which can be used to implement both of these strategies while maintaining the statistical characteristics of the original data. While SGS can be paired with a variety of generative methods, this article specifically demonstrates its application using the spiral discovery method (SDM), an autoregressive data generation model that allows for the exploratory manipulation of numerical data. The uniqueness and widespread applicability of this approach stem from its support for the fine-grained optimization of exploration versus exploitation goals through an interpretable set of hyperparameters. The effectiveness of the SGS framework combined with SDM is validated on two benchmark examples, one focusing on compression and the other on augmentation, showcasing its potential as a tool for dataset management in engineering contexts. |
doi_str_mv | 10.1109/TSMC.2024.3448206 |
format | article |
fullrecord | <record><control><sourceid>crossref_ieee_</sourceid><recordid>TN_cdi_ieee_primary_10666739</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10666739</ieee_id><sourcerecordid>10_1109_TSMC_2024_3448206</sourcerecordid><originalsourceid>FETCH-LOGICAL-c148t-18be47a94f1ffa16a096fb2542268eecc74c6002cdaf3b32001cd9a16556cd1a3</originalsourceid><addsrcrecordid>eNpNkF9PwjAUxRujiQT5ACY-9AM4bLtSNt-WIWgC8WHwvHTdLUzYurQFw6uf3PEnxqd7cnLOSe4PoUdKhpSS-GWZLdIhI4wPQ84jRsQN6jEqooCxkN3-aSru0cC5L0IIZZEIieihn2xfOFm3O3jGM2jASt8p2ZQ481Jt8cpVzRr7DeCsrazc4UnllDmAPeIF-I0pX3GCp1bW8G3sFmtjcbL3xsLagnPVAfBEeolTU7dnwzTn8WS_rqHx0nfGA7rTcudgcL19tJq-LdP3YP45-0iTeaAoj3xAowL4WMZcU60lFZLEQhdsxBkTEYBSY64EIUyVUodFyLonVRl3wdFIqJLKsI_oZVdZ45wFnbe2qqU95pTkJ475iWN-4phfOXadp0unAoB_eSHEOIzDX1BycLg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation</title><source>IEEE Xplore (Online service)</source><creator>Csapo, Adam B.</creator><creatorcontrib>Csapo, Adam B.</creatorcontrib><description>This article addresses the challenge of efficiently managing datasets of various sizes through two key strategies: 1) dataset compression and 2) synthetic augmentation. This article introduces a novel framework, referred to as subsample, generate, and stack (SGS), which can be used to implement both of these strategies while maintaining the statistical characteristics of the original data. While SGS can be paired with a variety of generative methods, this article specifically demonstrates its application using the spiral discovery method (SDM)-an autoregressive data generation model that allows for the exploratory manipulation of numerical data. 
The uniqueness and widespread applicability of this approach stems from its support for the fine-grained optimization of exploration versus exploitation goals through an interpretable set of hyperparameters. The effectiveness of the SGS framework combined with SDM is validated on two benchmark examples-one focusing on compression and the other on augmentation-showcasing its potential as a tool for dataset management in engineering contexts.</description><identifier>ISSN: 2168-2216</identifier><identifier>EISSN: 2168-2232</identifier><identifier>DOI: 10.1109/TSMC.2024.3448206</identifier><identifier>CODEN: ITSMFE</identifier><language>eng</language><publisher>IEEE</publisher><subject>Data compression ; Data models ; Dataset augmentation ; dataset compression ; Interpolation ; Mathematical models ; spiral discovery method (SDM) ; spiral optimization ; Spirals ; Training ; Training data</subject><ispartof>IEEE transactions on systems, man, and cybernetics. Systems, 2024-11, Vol.54 (11), p.7129-7142</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c148t-18be47a94f1ffa16a096fb2542268eecc74c6002cdaf3b32001cd9a16556cd1a3</cites><orcidid>0000-0001-9885-137X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10666739$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Csapo, Adam B.</creatorcontrib><title>Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation</title><title>IEEE transactions on systems, man, and cybernetics. 
Systems</title><addtitle>TSMC</addtitle><description>This article addresses the challenge of efficiently managing datasets of various sizes through two key strategies: 1) dataset compression and 2) synthetic augmentation. This article introduces a novel framework, referred to as subsample, generate, and stack (SGS), which can be used to implement both of these strategies while maintaining the statistical characteristics of the original data. While SGS can be paired with a variety of generative methods, this article specifically demonstrates its application using the spiral discovery method (SDM)-an autoregressive data generation model that allows for the exploratory manipulation of numerical data. The uniqueness and widespread applicability of this approach stems from its support for the fine-grained optimization of exploration versus exploitation goals through an interpretable set of hyperparameters. The effectiveness of the SGS framework combined with SDM is validated on two benchmark examples-one focusing on compression and the other on augmentation-showcasing its potential as a tool for dataset management in engineering contexts.</description><subject>Data compression</subject><subject>Data models</subject><subject>Dataset augmentation</subject><subject>dataset compression</subject><subject>Interpolation</subject><subject>Mathematical models</subject><subject>spiral discovery method (SDM)</subject><subject>spiral optimization</subject><subject>Spirals</subject><subject>Training</subject><subject>Training 
data</subject><issn>2168-2216</issn><issn>2168-2232</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNpNkF9PwjAUxRujiQT5ACY-9AM4bLtSNt-WIWgC8WHwvHTdLUzYurQFw6uf3PEnxqd7cnLOSe4PoUdKhpSS-GWZLdIhI4wPQ84jRsQN6jEqooCxkN3-aSru0cC5L0IIZZEIieihn2xfOFm3O3jGM2jASt8p2ZQ481Jt8cpVzRr7DeCsrazc4UnllDmAPeIF-I0pX3GCp1bW8G3sFmtjcbL3xsLagnPVAfBEeolTU7dnwzTn8WS_rqHx0nfGA7rTcudgcL19tJq-LdP3YP45-0iTeaAoj3xAowL4WMZcU60lFZLEQhdsxBkTEYBSY64EIUyVUodFyLonVRl3wdFIqJLKsI_oZVdZ45wFnbe2qqU95pTkJ475iWN-4phfOXadp0unAoB_eSHEOIzDX1BycLg</recordid><startdate>202411</startdate><enddate>202411</enddate><creator>Csapo, Adam B.</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0001-9885-137X</orcidid></search><sort><creationdate>202411</creationdate><title>Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation</title><author>Csapo, Adam B.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c148t-18be47a94f1ffa16a096fb2542268eecc74c6002cdaf3b32001cd9a16556cd1a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Data compression</topic><topic>Data models</topic><topic>Dataset augmentation</topic><topic>dataset compression</topic><topic>Interpolation</topic><topic>Mathematical models</topic><topic>spiral discovery method (SDM)</topic><topic>spiral optimization</topic><topic>Spirals</topic><topic>Training</topic><topic>Training data</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Csapo, Adam B.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE 
Xplore</collection><collection>CrossRef</collection><jtitle>IEEE transactions on systems, man, and cybernetics. Systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Csapo, Adam B.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation</atitle><jtitle>IEEE transactions on systems, man, and cybernetics. Systems</jtitle><stitle>TSMC</stitle><date>2024-11</date><risdate>2024</risdate><volume>54</volume><issue>11</issue><spage>7129</spage><epage>7142</epage><pages>7129-7142</pages><issn>2168-2216</issn><eissn>2168-2232</eissn><coden>ITSMFE</coden><abstract>This article addresses the challenge of efficiently managing datasets of various sizes through two key strategies: 1) dataset compression and 2) synthetic augmentation. This article introduces a novel framework, referred to as subsample, generate, and stack (SGS), which can be used to implement both of these strategies while maintaining the statistical characteristics of the original data. While SGS can be paired with a variety of generative methods, this article specifically demonstrates its application using the spiral discovery method (SDM)-an autoregressive data generation model that allows for the exploratory manipulation of numerical data. The uniqueness and widespread applicability of this approach stems from its support for the fine-grained optimization of exploration versus exploitation goals through an interpretable set of hyperparameters. 
The effectiveness of the SGS framework combined with SDM is validated on two benchmark examples-one focusing on compression and the other on augmentation-showcasing its potential as a tool for dataset management in engineering contexts.</abstract><pub>IEEE</pub><doi>10.1109/TSMC.2024.3448206</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0001-9885-137X</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2168-2216 |
ispartof | IEEE transactions on systems, man, and cybernetics. Systems, 2024-11, Vol.54 (11), p.7129-7142 |
issn | 2168-2216 2168-2232 |
language | eng |
recordid | cdi_ieee_primary_10666739 |
source | IEEE Xplore (Online service) |
subjects | Data compression; Data models; Dataset augmentation; dataset compression; Interpolation; Mathematical models; spiral discovery method (SDM); spiral optimization; Spirals; Training; Training data |
title | Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-30T21%3A05%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Subsample,%20Generate,%20and%20Stack%20Using%20the%20Spiral%20Discovery%20Method:%20A%20Framework%20for%20Autoregressive%20Data%20Compression%20and%20Augmentation&rft.jtitle=IEEE%20transactions%20on%20systems,%20man,%20and%20cybernetics.%20Systems&rft.au=Csapo,%20Adam%20B.&rft.date=2024-11&rft.volume=54&rft.issue=11&rft.spage=7129&rft.epage=7142&rft.pages=7129-7142&rft.issn=2168-2216&rft.eissn=2168-2232&rft.coden=ITSMFE&rft_id=info:doi/10.1109/TSMC.2024.3448206&rft_dat=%3Ccrossref_ieee_%3E10_1109_TSMC_2024_3448206%3C/crossref_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c148t-18be47a94f1ffa16a096fb2542268eecc74c6002cdaf3b32001cd9a16556cd1a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10666739&rfr_iscdi=true |