Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation

This article addresses the challenge of efficiently managing datasets of various sizes through two key strategies: 1) dataset compression and 2) synthetic augmentation. This article introduces a novel framework, referred to as subsample, generate, and stack (SGS), which can be used to implement both...

Bibliographic Details
Published in: IEEE transactions on systems, man, and cybernetics. Systems, 2024-11, Vol.54 (11), p.7129-7142
Main Author: Csapo, Adam B.
Format: Article
Language: English
cited_by
cites cdi_FETCH-LOGICAL-c148t-18be47a94f1ffa16a096fb2542268eecc74c6002cdaf3b32001cd9a16556cd1a3
container_end_page 7142
container_issue 11
container_start_page 7129
container_title IEEE transactions on systems, man, and cybernetics. Systems
container_volume 54
creator Csapo, Adam B.
description This article addresses the challenge of efficiently managing datasets of various sizes through two key strategies: 1) dataset compression and 2) synthetic augmentation. It introduces a novel framework, referred to as subsample, generate, and stack (SGS), which can be used to implement both of these strategies while maintaining the statistical characteristics of the original data. While SGS can be paired with a variety of generative methods, this article specifically demonstrates its application using the spiral discovery method (SDM), an autoregressive data generation model that allows for the exploratory manipulation of numerical data. The uniqueness and widespread applicability of this approach stem from its support for the fine-grained optimization of exploration versus exploitation goals through an interpretable set of hyperparameters. The effectiveness of the SGS framework combined with SDM is validated on two benchmark examples, one focusing on compression and the other on augmentation, showcasing its potential as a tool for dataset management in engineering contexts.
doi_str_mv 10.1109/TSMC.2024.3448206
format article
fullrecord <record><control><sourceid>crossref_ieee_</sourceid><recordid>TN_cdi_ieee_primary_10666739</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10666739</ieee_id><sourcerecordid>10_1109_TSMC_2024_3448206</sourcerecordid><originalsourceid>FETCH-LOGICAL-c148t-18be47a94f1ffa16a096fb2542268eecc74c6002cdaf3b32001cd9a16556cd1a3</originalsourceid><addsrcrecordid>eNpNkF9PwjAUxRujiQT5ACY-9AM4bLtSNt-WIWgC8WHwvHTdLUzYurQFw6uf3PEnxqd7cnLOSe4PoUdKhpSS-GWZLdIhI4wPQ84jRsQN6jEqooCxkN3-aSru0cC5L0IIZZEIieihn2xfOFm3O3jGM2jASt8p2ZQ481Jt8cpVzRr7DeCsrazc4UnllDmAPeIF-I0pX3GCp1bW8G3sFmtjcbL3xsLagnPVAfBEeolTU7dnwzTn8WS_rqHx0nfGA7rTcudgcL19tJq-LdP3YP45-0iTeaAoj3xAowL4WMZcU60lFZLEQhdsxBkTEYBSY64EIUyVUodFyLonVRl3wdFIqJLKsI_oZVdZ45wFnbe2qqU95pTkJ475iWN-4phfOXadp0unAoB_eSHEOIzDX1BycLg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation</title><source>IEEE Xplore (Online service)</source><creator>Csapo, Adam B.</creator><creatorcontrib>Csapo, Adam B.</creatorcontrib><description>This article addresses the challenge of efficiently managing datasets of various sizes through two key strategies: 1) dataset compression and 2) synthetic augmentation. This article introduces a novel framework, referred to as subsample, generate, and stack (SGS), which can be used to implement both of these strategies while maintaining the statistical characteristics of the original data. While SGS can be paired with a variety of generative methods, this article specifically demonstrates its application using the spiral discovery method (SDM)-an autoregressive data generation model that allows for the exploratory manipulation of numerical data. 
The uniqueness and widespread applicability of this approach stems from its support for the fine-grained optimization of exploration versus exploitation goals through an interpretable set of hyperparameters. The effectiveness of the SGS framework combined with SDM is validated on two benchmark examples-one focusing on compression and the other on augmentation-showcasing its potential as a tool for dataset management in engineering contexts.</description><identifier>ISSN: 2168-2216</identifier><identifier>EISSN: 2168-2232</identifier><identifier>DOI: 10.1109/TSMC.2024.3448206</identifier><identifier>CODEN: ITSMFE</identifier><language>eng</language><publisher>IEEE</publisher><subject>Data compression ; Data models ; Dataset augmentation ; dataset compression ; Interpolation ; Mathematical models ; spiral discovery method (SDM) ; spiral optimization ; Spirals ; Training ; Training data</subject><ispartof>IEEE transactions on systems, man, and cybernetics. Systems, 2024-11, Vol.54 (11), p.7129-7142</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c148t-18be47a94f1ffa16a096fb2542268eecc74c6002cdaf3b32001cd9a16556cd1a3</cites><orcidid>0000-0001-9885-137X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10666739$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Csapo, Adam B.</creatorcontrib><title>Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation</title><title>IEEE transactions on systems, man, and cybernetics. 
Systems</title><addtitle>TSMC</addtitle><description>This article addresses the challenge of efficiently managing datasets of various sizes through two key strategies: 1) dataset compression and 2) synthetic augmentation. This article introduces a novel framework, referred to as subsample, generate, and stack (SGS), which can be used to implement both of these strategies while maintaining the statistical characteristics of the original data. While SGS can be paired with a variety of generative methods, this article specifically demonstrates its application using the spiral discovery method (SDM)-an autoregressive data generation model that allows for the exploratory manipulation of numerical data. The uniqueness and widespread applicability of this approach stems from its support for the fine-grained optimization of exploration versus exploitation goals through an interpretable set of hyperparameters. The effectiveness of the SGS framework combined with SDM is validated on two benchmark examples-one focusing on compression and the other on augmentation-showcasing its potential as a tool for dataset management in engineering contexts.</description><subject>Data compression</subject><subject>Data models</subject><subject>Dataset augmentation</subject><subject>dataset compression</subject><subject>Interpolation</subject><subject>Mathematical models</subject><subject>spiral discovery method (SDM)</subject><subject>spiral optimization</subject><subject>Spirals</subject><subject>Training</subject><subject>Training 
data</subject><issn>2168-2216</issn><issn>2168-2232</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNpNkF9PwjAUxRujiQT5ACY-9AM4bLtSNt-WIWgC8WHwvHTdLUzYurQFw6uf3PEnxqd7cnLOSe4PoUdKhpSS-GWZLdIhI4wPQ84jRsQN6jEqooCxkN3-aSru0cC5L0IIZZEIieihn2xfOFm3O3jGM2jASt8p2ZQ481Jt8cpVzRr7DeCsrazc4UnllDmAPeIF-I0pX3GCp1bW8G3sFmtjcbL3xsLagnPVAfBEeolTU7dnwzTn8WS_rqHx0nfGA7rTcudgcL19tJq-LdP3YP45-0iTeaAoj3xAowL4WMZcU60lFZLEQhdsxBkTEYBSY64EIUyVUodFyLonVRl3wdFIqJLKsI_oZVdZ45wFnbe2qqU95pTkJ475iWN-4phfOXadp0unAoB_eSHEOIzDX1BycLg</recordid><startdate>202411</startdate><enddate>202411</enddate><creator>Csapo, Adam B.</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0001-9885-137X</orcidid></search><sort><creationdate>202411</creationdate><title>Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation</title><author>Csapo, Adam B.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c148t-18be47a94f1ffa16a096fb2542268eecc74c6002cdaf3b32001cd9a16556cd1a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Data compression</topic><topic>Data models</topic><topic>Dataset augmentation</topic><topic>dataset compression</topic><topic>Interpolation</topic><topic>Mathematical models</topic><topic>spiral discovery method (SDM)</topic><topic>spiral optimization</topic><topic>Spirals</topic><topic>Training</topic><topic>Training data</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Csapo, Adam B.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE 
Xplore</collection><collection>CrossRef</collection><jtitle>IEEE transactions on systems, man, and cybernetics. Systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Csapo, Adam B.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation</atitle><jtitle>IEEE transactions on systems, man, and cybernetics. Systems</jtitle><stitle>TSMC</stitle><date>2024-11</date><risdate>2024</risdate><volume>54</volume><issue>11</issue><spage>7129</spage><epage>7142</epage><pages>7129-7142</pages><issn>2168-2216</issn><eissn>2168-2232</eissn><coden>ITSMFE</coden><abstract>This article addresses the challenge of efficiently managing datasets of various sizes through two key strategies: 1) dataset compression and 2) synthetic augmentation. This article introduces a novel framework, referred to as subsample, generate, and stack (SGS), which can be used to implement both of these strategies while maintaining the statistical characteristics of the original data. While SGS can be paired with a variety of generative methods, this article specifically demonstrates its application using the spiral discovery method (SDM)-an autoregressive data generation model that allows for the exploratory manipulation of numerical data. The uniqueness and widespread applicability of this approach stems from its support for the fine-grained optimization of exploration versus exploitation goals through an interpretable set of hyperparameters. 
The effectiveness of the SGS framework combined with SDM is validated on two benchmark examples-one focusing on compression and the other on augmentation-showcasing its potential as a tool for dataset management in engineering contexts.</abstract><pub>IEEE</pub><doi>10.1109/TSMC.2024.3448206</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0001-9885-137X</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 2168-2216
ispartof IEEE transactions on systems, man, and cybernetics. Systems, 2024-11, Vol.54 (11), p.7129-7142
issn 2168-2216
2168-2232
language eng
recordid cdi_ieee_primary_10666739
source IEEE Xplore (Online service)
subjects Data compression
Data models
Dataset augmentation
dataset compression
Interpolation
Mathematical models
spiral discovery method (SDM)
spiral optimization
Spirals
Training
Training data
title Subsample, Generate, and Stack Using the Spiral Discovery Method: A Framework for Autoregressive Data Compression and Augmentation
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-30T21%3A05%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Subsample,%20Generate,%20and%20Stack%20Using%20the%20Spiral%20Discovery%20Method:%20A%20Framework%20for%20Autoregressive%20Data%20Compression%20and%20Augmentation&rft.jtitle=IEEE%20transactions%20on%20systems,%20man,%20and%20cybernetics.%20Systems&rft.au=Csapo,%20Adam%20B.&rft.date=2024-11&rft.volume=54&rft.issue=11&rft.spage=7129&rft.epage=7142&rft.pages=7129-7142&rft.issn=2168-2216&rft.eissn=2168-2232&rft.coden=ITSMFE&rft_id=info:doi/10.1109/TSMC.2024.3448206&rft_dat=%3Ccrossref_ieee_%3E10_1109_TSMC_2024_3448206%3C/crossref_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c148t-18be47a94f1ffa16a096fb2542268eecc74c6002cdaf3b32001cd9a16556cd1a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10666739&rfr_iscdi=true