Leveraging Computational Storage for Power-Efficient Distributed Data Analytics

This article presents a family of computational storage drives (CSDs) and demonstrates their performance and power improvements due to in-storage processing (ISP) when running big data analytics applications. CSDs are an emerging class of solid state drives that are capable of running user code whil...

Bibliographic Details
Published in: ACM transactions on embedded computing systems 2022-10, Vol.21 (6), p.1-36, Article 82
Main Authors: HeydariGorji, Ali, Rezaei, Siavash, Torabzadehkashi, Mahdi, Bobarshad, Hossein, Alves, Vladimir, Chou, Pai H.
Format: Article
Language: English
cited_by
cites cdi_FETCH-LOGICAL-a239t-41ed4c6c2de4336bd0cdc384fae3087b4046d61f2f6696eddc948a8ada7392da3
container_end_page 36
container_issue 6
container_start_page 1
container_title ACM transactions on embedded computing systems
container_volume 21
creator HeydariGorji, Ali
Rezaei, Siavash
Torabzadehkashi, Mahdi
Bobarshad, Hossein
Alves, Vladimir
Chou, Pai H.
description This article presents a family of computational storage drives (CSDs) and demonstrates their performance and power improvements due to in-storage processing (ISP) when running big data analytics applications. CSDs are an emerging class of solid state drives that are capable of running user code while minimizing data transfer time and energy. Applications that can benefit from in situ processing include distributed training, distributed inferencing, and databases. To achieve the full advantage of the proposed ISP architecture, we propose software solutions for workload balancing before and at runtime for training and inferencing applications. Other applications such as sharding-based databases can readily take advantage of our ISP structure without additional tooling. Experimental results on different capacity and form factors of CSDs show up to 3.1× speedup in processing while reducing the energy consumption and data transfer by up to 67% and 68%, respectively, compared to regular enterprise solid state drives.
doi_str_mv 10.1145/3528577
format article
fullrecord <record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3528577</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3528577</sourcerecordid><originalsourceid>FETCH-LOGICAL-a239t-41ed4c6c2de4336bd0cdc384fae3087b4046d61f2f6696eddc948a8ada7392da3</originalsourceid><addsrcrecordid>eNo9kEtLAzEUhYMoWKu4d5Wdq2gyeUxmWdr6gIEK6nq4k0eJdGZKkir9905pdXUv93z3cDgI3TL6wJiQj1wWWpblGZowKTXhQsnzw84rUlFdXqKrlL4oZWUh5AStavftIqxDv8bzodvuMuQw9LDB73kY7w77IeK34cdFsvQ-mOD6jBch5RjaXXYWLyADno0f-xxMukYXHjbJ3ZzmFH0-LT_mL6RePb_OZzWBgleZCOasMMoU1gnOVWupsYZr4cHxMWQrqFBWMV94pSrlrDWV0KDBQsmrwgKfovujr4lDStH5ZhtDB3HfMNocemhOPYzk3ZEE0_1Df-IvgN5ZBg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Leveraging Computational Storage for Power-Efficient Distributed Data Analytics</title><source>Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)</source><creator>HeydariGorji, Ali ; Rezaei, Siavash ; Torabzadehkashi, Mahdi ; Bobarshad, Hossein ; Alves, Vladimir ; Chou, Pai H.</creator><creatorcontrib>HeydariGorji, Ali ; Rezaei, Siavash ; Torabzadehkashi, Mahdi ; Bobarshad, Hossein ; Alves, Vladimir ; Chou, Pai H.</creatorcontrib><description>This article presents a family of computational storage drives (CSDs) and demonstrates their performance and power improvements due to in-storage processing (ISP) when running big data analytics applications. CSDs are an emerging class of solid state drives that are capable of running user code while minimizing data transfer time and energy. Applications that can benefit from in situ processing include distributed training, distributed inferencing, and databases. To achieve the full advantage of the proposed ISP architecture, we propose software solutions for workload balancing before and at runtime for training and inferencing applications. 
Other applications such as sharding-based databases can readily take advantage of our ISP structure without additional tooling. Experimental results on different capacity and form factors of CSDs show up to 3.1× speedup in processing while reducing the energy consumption and data transfer by up to 67% and 68%, respectively, compared to regular enterprise solid state drives.</description><identifier>ISSN: 1539-9087</identifier><identifier>EISSN: 1558-3465</identifier><identifier>DOI: 10.1145/3528577</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Computer systems organization ; Computing methodologies ; External storage ; Hardware ; Peer-to-peer architectures ; Self-organization</subject><ispartof>ACM transactions on embedded computing systems, 2022-10, Vol.21 (6), p.1-36, Article 82</ispartof><rights>Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. 
Request permissions from</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a239t-41ed4c6c2de4336bd0cdc384fae3087b4046d61f2f6696eddc948a8ada7392da3</cites><orcidid>0000-0002-6218-8745</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,778,782,27907,27908</link.rule.ids></links><search><creatorcontrib>HeydariGorji, Ali</creatorcontrib><creatorcontrib>Rezaei, Siavash</creatorcontrib><creatorcontrib>Torabzadehkashi, Mahdi</creatorcontrib><creatorcontrib>Bobarshad, Hossein</creatorcontrib><creatorcontrib>Alves, Vladimir</creatorcontrib><creatorcontrib>Chou, Pai H.</creatorcontrib><title>Leveraging Computational Storage for Power-Efficient Distributed Data Analytics</title><title>ACM transactions on embedded computing systems</title><addtitle>ACM TECS</addtitle><description>This article presents a family of computational storage drives (CSDs) and demonstrates their performance and power improvements due to in-storage processing (ISP) when running big data analytics applications. CSDs are an emerging class of solid state drives that are capable of running user code while minimizing data transfer time and energy. Applications that can benefit from in situ processing include distributed training, distributed inferencing, and databases. To achieve the full advantage of the proposed ISP architecture, we propose software solutions for workload balancing before and at runtime for training and inferencing applications. Other applications such as sharding-based databases can readily take advantage of our ISP structure without additional tooling. 
Experimental results on different capacity and form factors of CSDs show up to 3.1× speedup in processing while reducing the energy consumption and data transfer by up to 67% and 68%, respectively, compared to regular enterprise solid state drives.</description><subject>Computer systems organization</subject><subject>Computing methodologies</subject><subject>External storage</subject><subject>Hardware</subject><subject>Peer-to-peer architectures</subject><subject>Self-organization</subject><issn>1539-9087</issn><issn>1558-3465</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNo9kEtLAzEUhYMoWKu4d5Wdq2gyeUxmWdr6gIEK6nq4k0eJdGZKkir9905pdXUv93z3cDgI3TL6wJiQj1wWWpblGZowKTXhQsnzw84rUlFdXqKrlL4oZWUh5AStavftIqxDv8bzodvuMuQw9LDB73kY7w77IeK34cdFsvQ-mOD6jBch5RjaXXYWLyADno0f-xxMukYXHjbJ3ZzmFH0-LT_mL6RePb_OZzWBgleZCOasMMoU1gnOVWupsYZr4cHxMWQrqFBWMV94pSrlrDWV0KDBQsmrwgKfovujr4lDStH5ZhtDB3HfMNocemhOPYzk3ZEE0_1Df-IvgN5ZBg</recordid><startdate>20221018</startdate><enddate>20221018</enddate><creator>HeydariGorji, Ali</creator><creator>Rezaei, Siavash</creator><creator>Torabzadehkashi, Mahdi</creator><creator>Bobarshad, Hossein</creator><creator>Alves, Vladimir</creator><creator>Chou, Pai H.</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-6218-8745</orcidid></search><sort><creationdate>20221018</creationdate><title>Leveraging Computational Storage for Power-Efficient Distributed Data Analytics</title><author>HeydariGorji, Ali ; Rezaei, Siavash ; Torabzadehkashi, Mahdi ; Bobarshad, Hossein ; Alves, Vladimir ; Chou, Pai H.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a239t-41ed4c6c2de4336bd0cdc384fae3087b4046d61f2f6696eddc948a8ada7392da3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Computer systems 
organization</topic><topic>Computing methodologies</topic><topic>External storage</topic><topic>Hardware</topic><topic>Peer-to-peer architectures</topic><topic>Self-organization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>HeydariGorji, Ali</creatorcontrib><creatorcontrib>Rezaei, Siavash</creatorcontrib><creatorcontrib>Torabzadehkashi, Mahdi</creatorcontrib><creatorcontrib>Bobarshad, Hossein</creatorcontrib><creatorcontrib>Alves, Vladimir</creatorcontrib><creatorcontrib>Chou, Pai H.</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on embedded computing systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>HeydariGorji, Ali</au><au>Rezaei, Siavash</au><au>Torabzadehkashi, Mahdi</au><au>Bobarshad, Hossein</au><au>Alves, Vladimir</au><au>Chou, Pai H.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Leveraging Computational Storage for Power-Efficient Distributed Data Analytics</atitle><jtitle>ACM transactions on embedded computing systems</jtitle><stitle>ACM TECS</stitle><date>2022-10-18</date><risdate>2022</risdate><volume>21</volume><issue>6</issue><spage>1</spage><epage>36</epage><pages>1-36</pages><artnum>82</artnum><issn>1539-9087</issn><eissn>1558-3465</eissn><abstract>This article presents a family of computational storage drives (CSDs) and demonstrates their performance and power improvements due to in-storage processing (ISP) when running big data analytics applications. CSDs are an emerging class of solid state drives that are capable of running user code while minimizing data transfer time and energy. Applications that can benefit from in situ processing include distributed training, distributed inferencing, and databases. 
To achieve the full advantage of the proposed ISP architecture, we propose software solutions for workload balancing before and at runtime for training and inferencing applications. Other applications such as sharding-based databases can readily take advantage of our ISP structure without additional tooling. Experimental results on different capacity and form factors of CSDs show up to 3.1× speedup in processing while reducing the energy consumption and data transfer by up to 67% and 68%, respectively, compared to regular enterprise solid state drives.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3528577</doi><tpages>36</tpages><orcidid>https://orcid.org/0000-0002-6218-8745</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1539-9087
ispartof ACM transactions on embedded computing systems, 2022-10, Vol.21 (6), p.1-36, Article 82
issn 1539-9087
1558-3465
language eng
recordid cdi_crossref_primary_10_1145_3528577
source Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)
subjects Computer systems organization
Computing methodologies
External storage
Hardware
Peer-to-peer architectures
Self-organization
title Leveraging Computational Storage for Power-Efficient Distributed Data Analytics
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T16%3A48%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Leveraging%20Computational%20Storage%20for%20Power-Efficient%20Distributed%20Data%20Analytics&rft.jtitle=ACM%20transactions%20on%20embedded%20computing%20systems&rft.au=HeydariGorji,%20Ali&rft.date=2022-10-18&rft.volume=21&rft.issue=6&rft.spage=1&rft.epage=36&rft.pages=1-36&rft.artnum=82&rft.issn=1539-9087&rft.eissn=1558-3465&rft_id=info:doi/10.1145/3528577&rft_dat=%3Cacm_cross%3E3528577%3C/acm_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a239t-41ed4c6c2de4336bd0cdc384fae3087b4046d61f2f6696eddc948a8ada7392da3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true