FLEX: Introducing FLEXible Execution on CGRA with Spatio-Temporal Vector Dataflow
Main Authors: | , , , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Summary: | Coarse-Grained Reconfigurable Arrays (CGRAs) are well-suited to resource-constrained edge devices due to their optimal combination of performance, energy efficiency, and adaptability. However, CGRAs typically follow a rigid execution model, either spatio-temporal or spatial, irrespective of the workload, which limits their efficiency. Spatio-temporal execution requires per-cycle reconfiguration, resulting in higher energy consumption. Conversely, spatial execution maintains the same configuration over a longer period, but this fixed mapping constraint can hinder the performance of complex applications and increase data memory accesses, leading to higher energy consumption. We introduce FLEX, a CGRA with a novel, flexible spatio-temporal vector dataflow execution model. This model processes a vector of data sequentially and chains them spatio-temporally. FLEX also supports variable vector lengths determined at compile time, enabling a more flexible execution paradigm. Our execution model reduces the reconfiguration frequency inherent in purely spatio-temporal mapping and mitigates the performance limitations and extra data memory accesses associated with purely spatial mapping. FLEX matches the performance of a spatio-temporal CGRA but with 45% less energy and a 1.9× power efficiency improvement. Moreover, compared to a baseline spatial CGRA, FLEX consumes 35% less energy and delivers a 1.6× improvement in power efficiency at 1.5× higher throughput. |
---|---|
ISSN: | 1558-2434 |
DOI: | 10.1109/ICCAD57390.2023.10323612 |