FLEX: Introducing FLEXible Execution on CGRA with Spatio-Temporal Vector Dataflow
Coarse-Grained Reconfigurable Arrays (CGRAs) are well-suited to resource-constrained edge devices due to their optimal combination of performance, energy efficiency, and adaptability. However, CGRAs typically follow a rigid execution model - either spatio-temporal or spatial - irrespective of the wo...
Main Authors: | Bandara, Thilini Kaushalya; Wu, Dan; Juneja, Rohan; Wijerathne, Dhananjaya; Mitra, Tulika; Peh, Li-Shiuan |
---|---|
Format: | Conference Proceeding |
Language: | English |
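Subjects: | Adaptive arrays; Coarse Grained Reconfigurable Array (CGRA); Edge acceleration; Energy consumption; Flexible printed circuits; Limiting; Memory management; Performance evaluation; Throughput; Vector dataflow |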
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 9 |
container_issue | |
container_start_page | 1 |
container_title | |
container_volume | |
creator | Bandara, Thilini Kaushalya; Wu, Dan; Juneja, Rohan; Wijerathne, Dhananjaya; Mitra, Tulika; Peh, Li-Shiuan |
description | Coarse-Grained Reconfigurable Arrays (CGRAs) are well-suited to resource-constrained edge devices due to their optimal combination of performance, energy efficiency, and adaptability. However, CGRAs typically follow a rigid execution model - either spatio-temporal or spatial - irrespective of the workload, limiting their efficiency. Spatio-temporal execution requires per-cycle reconfiguration, resulting in higher energy consumption. Conversely, spatial execution maintains the same configuration over a longer period; but this fixed mapping constraint can hinder the performance of complex applications and increase data memory accesses, leading to higher energy consumption. We introduce FLEX, a CGRA with a novel, flexible spatio-temporal vector dataflow execution model. This model processes a vector of data sequentially and chains them spatio-temporally. FLEX also supports variable vector lengths determined at compile time, enabling a more flexible execution paradigm. Our execution model reduces the reconfiguration frequency inherent in purely spatio-temporal mapping and mitigates the performance limitations and extra data memory accesses associated with purely spatial mapping. FLEX matches the performance of spatio-temporal CGRA but with 45% less energy and a 1.9× power efficiency improvement. Moreover, compared to a baseline spatial CGRA, FLEX consumes 35% less energy and delivers a 1.6× improvement in power efficiency at 1.5× higher throughput. |
doi_str_mv | 10.1109/ICCAD57390.2023.10323612 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_10323612</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10323612</ieee_id><sourcerecordid>10323612</sourcerecordid><originalsourceid>FETCH-LOGICAL-i204t-f45180e840c0dbc1f5032771609a541fd69e777d4de3bfd6a7abb90df06289763</originalsourceid><addsrcrecordid>eNo1UF9LwzAcjILgnPsGPuQLtP7yv_GtdN0cFESd4ttIm1QjXVvajOm3t0OFg-MO7jgOIUwgJgT07SbL0qVQTENMgbKYAKNMEnqGFlrphIlJUyrEOZoRIZKIcsYv0dU4fgJMgUTO0OOqyN_u8KYNQ2cPlW_f8cnxZeNw_uWqQ_Bdiydk66cUH334wM-9mcxo6_Z9N5gGv7oqdANemmDqpjteo4vaNKNb_PEcvazybXYfFQ_rTZYWkafAQ1RzQRJwCYcKbFmR-jRWKSJBG8FJbaV2SinLrWPlpIwyZanB1iBpopVkc3Tz2-udc7t-8HszfO_-L2A_e65Pmw</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>FLEX: Introducing FLEXible Execution on CGRA with Spatio-Temporal Vector Dataflow</title><source>IEEE Xplore All Conference Series</source><creator>Bandara, Thilini Kaushalya ; Wu, Dan ; Juneja, Rohan ; Wijerathne, Dhananjaya ; Mitra, Tulika ; Peh, Li-Shiuan</creator><creatorcontrib>Bandara, Thilini Kaushalya ; Wu, Dan ; Juneja, Rohan ; Wijerathne, Dhananjaya ; Mitra, Tulika ; Peh, Li-Shiuan</creatorcontrib><description>Coarse-Grained Reconfigurable Arrays (CGRAs) are well-suited to resource-constrained edge devices due to their optimal combination of performance, energy efficiency, and adaptability. However, CGRAs typically follow a rigid execution model - either spatio-temporal or spatial - irrespective of the workload, limiting their efficiency. Spatio-temporal execution requires per-cycle reconfiguration, resulting in higher energy consumption. Conversely, spatial execution maintains the same configuration over a longer period; but this fixed mapping constraint can hinder the performance of complex applications and increase data memory accesses, leading to higher energy consumption. 
We introduce FLEX, a CGRA with a novel, flexible spatio-temporal vector dataflow execution model. This model processes a vector of data sequentially and chains them spatio-temporally. FLEX also supports variable vector lengths determined at compile time, enabling a more flexible execution paradigm. Our execution model reduces the reconfiguration frequency inherent in purely spatio-temporal mapping and mitigates the performance limitations and extra data memory accesses associated with purely spatial mapping. FLEX matches the performance of spatio-temporal CGRA but with 45% less energy and a 1.9 ×power efficiency improvement. Moreover, compared to a baseline spatial CGRA, FLEX consumes 35% less energy and delivers a 1.6× improvement in power efficiency at 1.5× higher throughput.</description><identifier>EISSN: 1558-2434</identifier><identifier>EISBN: 9798350322255</identifier><identifier>DOI: 10.1109/ICCAD57390.2023.10323612</identifier><language>eng</language><publisher>IEEE</publisher><subject>Adaptive arrays ; Coarse Grained Reconfigurable Array (CGRA) ; Edge acceleration ; Energy consumption ; Flexible printed circuits ; Limiting ; Memory management ; Performance evaluation ; Throughput ; Vector dataflow</subject><ispartof>2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2023, p.1-9</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10323612$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10323612$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Bandara, Thilini Kaushalya</creatorcontrib><creatorcontrib>Wu, Dan</creatorcontrib><creatorcontrib>Juneja, 
Rohan</creatorcontrib><creatorcontrib>Wijerathne, Dhananjaya</creatorcontrib><creatorcontrib>Mitra, Tulika</creatorcontrib><creatorcontrib>Peh, Li-Shiuan</creatorcontrib><title>FLEX: Introducing FLEXible Execution on CGRA with Spatio-Temporal Vector Dataflow</title><title>2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD)</title><addtitle>ICCAD</addtitle><description>Coarse-Grained Reconfigurable Arrays (CGRAs) are well-suited to resource-constrained edge devices due to their optimal combination of performance, energy efficiency, and adaptability. However, CGRAs typically follow a rigid execution model - either spatio-temporal or spatial - irrespective of the workload, limiting their efficiency. Spatio-temporal execution requires per-cycle reconfiguration, resulting in higher energy consumption. Conversely, spatial execution maintains the same configuration over a longer period; but this fixed mapping constraint can hinder the performance of complex applications and increase data memory accesses, leading to higher energy consumption. We introduce FLEX, a CGRA with a novel, flexible spatio-temporal vector dataflow execution model. This model processes a vector of data sequentially and chains them spatio-temporally. FLEX also supports variable vector lengths determined at compile time, enabling a more flexible execution paradigm. Our execution model reduces the reconfiguration frequency inherent in purely spatio-temporal mapping and mitigates the performance limitations and extra data memory accesses associated with purely spatial mapping. FLEX matches the performance of spatio-temporal CGRA but with 45% less energy and a 1.9 ×power efficiency improvement. 
Moreover, compared to a baseline spatial CGRA, FLEX consumes 35% less energy and delivers a 1.6× improvement in power efficiency at 1.5× higher throughput.</description><subject>Adaptive arrays</subject><subject>Coarse Grained Reconfigurable Array (CGRA)</subject><subject>Edge acceleration</subject><subject>Energy consumption</subject><subject>Flexible printed circuits</subject><subject>Limiting</subject><subject>Memory management</subject><subject>Performance evaluation</subject><subject>Throughput</subject><subject>Vector dataflow</subject><issn>1558-2434</issn><isbn>9798350322255</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2023</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNo1UF9LwzAcjILgnPsGPuQLtP7yv_GtdN0cFESd4ttIm1QjXVvajOm3t0OFg-MO7jgOIUwgJgT07SbL0qVQTENMgbKYAKNMEnqGFlrphIlJUyrEOZoRIZKIcsYv0dU4fgJMgUTO0OOqyN_u8KYNQ2cPlW_f8cnxZeNw_uWqQ_Bdiydk66cUH334wM-9mcxo6_Z9N5gGv7oqdANemmDqpjteo4vaNKNb_PEcvazybXYfFQ_rTZYWkafAQ1RzQRJwCYcKbFmR-jRWKSJBG8FJbaV2SinLrWPlpIwyZanB1iBpopVkc3Tz2-udc7t-8HszfO_-L2A_e65Pmw</recordid><startdate>20231028</startdate><enddate>20231028</enddate><creator>Bandara, Thilini Kaushalya</creator><creator>Wu, Dan</creator><creator>Juneja, Rohan</creator><creator>Wijerathne, Dhananjaya</creator><creator>Mitra, Tulika</creator><creator>Peh, Li-Shiuan</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>20231028</creationdate><title>FLEX: Introducing FLEXible Execution on CGRA with Spatio-Temporal Vector Dataflow</title><author>Bandara, Thilini Kaushalya ; Wu, Dan ; Juneja, Rohan ; Wijerathne, Dhananjaya ; Mitra, Tulika ; Peh, 
Li-Shiuan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i204t-f45180e840c0dbc1f5032771609a541fd69e777d4de3bfd6a7abb90df06289763</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Adaptive arrays</topic><topic>Coarse Grained Reconfigurable Array (CGRA)</topic><topic>Edge acceleration</topic><topic>Energy consumption</topic><topic>Flexible printed circuits</topic><topic>Limiting</topic><topic>Memory management</topic><topic>Performance evaluation</topic><topic>Throughput</topic><topic>Vector dataflow</topic><toplevel>online_resources</toplevel><creatorcontrib>Bandara, Thilini Kaushalya</creatorcontrib><creatorcontrib>Wu, Dan</creatorcontrib><creatorcontrib>Juneja, Rohan</creatorcontrib><creatorcontrib>Wijerathne, Dhananjaya</creatorcontrib><creatorcontrib>Mitra, Tulika</creatorcontrib><creatorcontrib>Peh, Li-Shiuan</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Bandara, Thilini Kaushalya</au><au>Wu, Dan</au><au>Juneja, Rohan</au><au>Wijerathne, Dhananjaya</au><au>Mitra, Tulika</au><au>Peh, Li-Shiuan</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>FLEX: Introducing FLEXible Execution on CGRA with Spatio-Temporal Vector Dataflow</atitle><btitle>2023 IEEE/ACM International Conference on Computer Aided Design 
(ICCAD)</btitle><stitle>ICCAD</stitle><date>2023-10-28</date><risdate>2023</risdate><spage>1</spage><epage>9</epage><pages>1-9</pages><eissn>1558-2434</eissn><eisbn>9798350322255</eisbn><abstract>Coarse-Grained Reconfigurable Arrays (CGRAs) are well-suited to resource-constrained edge devices due to their optimal combination of performance, energy efficiency, and adaptability. However, CGRAs typically follow a rigid execution model - either spatio-temporal or spatial - irrespective of the workload, limiting their efficiency. Spatio-temporal execution requires per-cycle reconfiguration, resulting in higher energy consumption. Conversely, spatial execution maintains the same configuration over a longer period; but this fixed mapping constraint can hinder the performance of complex applications and increase data memory accesses, leading to higher energy consumption. We introduce FLEX, a CGRA with a novel, flexible spatio-temporal vector dataflow execution model. This model processes a vector of data sequentially and chains them spatio-temporally. FLEX also supports variable vector lengths determined at compile time, enabling a more flexible execution paradigm. Our execution model reduces the reconfiguration frequency inherent in purely spatio-temporal mapping and mitigates the performance limitations and extra data memory accesses associated with purely spatial mapping. FLEX matches the performance of spatio-temporal CGRA but with 45% less energy and a 1.9 ×power efficiency improvement. Moreover, compared to a baseline spatial CGRA, FLEX consumes 35% less energy and delivers a 1.6× improvement in power efficiency at 1.5× higher throughput.</abstract><pub>IEEE</pub><doi>10.1109/ICCAD57390.2023.10323612</doi><tpages>9</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | EISSN: 1558-2434 |
ispartof | 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2023, p.1-9 |
issn | 1558-2434 |
language | eng |
recordid | cdi_ieee_primary_10323612 |
source | IEEE Xplore All Conference Series |
subjects | Adaptive arrays; Coarse Grained Reconfigurable Array (CGRA); Edge acceleration; Energy consumption; Flexible printed circuits; Limiting; Memory management; Performance evaluation; Throughput; Vector dataflow |
title | FLEX: Introducing FLEXible Execution on CGRA with Spatio-Temporal Vector Dataflow |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T21%3A05%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=FLEX:%20Introducing%20FLEXible%20Execution%20on%20CGRA%20with%20Spatio-Temporal%20Vector%20Dataflow&rft.btitle=2023%20IEEE/ACM%20International%20Conference%20on%20Computer%20Aided%20Design%20(ICCAD)&rft.au=Bandara,%20Thilini%20Kaushalya&rft.date=2023-10-28&rft.spage=1&rft.epage=9&rft.pages=1-9&rft.eissn=1558-2434&rft_id=info:doi/10.1109/ICCAD57390.2023.10323612&rft.eisbn=9798350322255&rft_dat=%3Cieee_CHZPO%3E10323612%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i204t-f45180e840c0dbc1f5032771609a541fd69e777d4de3bfd6a7abb90df06289763%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10323612&rfr_iscdi=true |