TSAR-ILP: Tile-based, Synchronization-AwaRe ILP Allocating Heterogeneous Platforms for Streaming Applications


Bibliographic Details
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2023-11, Vol. 42 (11), p. 1-1
Main Authors: Morais, Bruno; Zhang, Jinghan; Schirner, Gunar
Format: Article
Language: English
Description
Summary: Automatic design space exploration (DSE) is key in hardware-software (HW/SW) co-design. To cope with the large design space, explorations are often heuristic-based and/or approximate, yielding potentially locally optimal solutions. Without knowing the globally optimal solution, strong assertions about performance upper/lower bounds cannot be made. In contrast, integer linear programming (ILP) formulations can produce exact (optimal) solutions. Previous ILP-based formulations, however, lack support for tile-based architectures and realistic synchronization models, limiting their DSE capabilities. This work introduces a tile-based, synchronization-aware ILP (TSAR-ILP) formulation that overcomes these limitations. TSAR-ILP introduces and formalizes the allocation/binding problems and attains optimal solutions for mapping streaming applications onto template platforms. Using TSAR-ILP, this work explores a hardware accelerator-rich (HWACC-rich) platform with direct HWACC-to-HWACC communication under HW area constraints for 40 OpenVX applications. To illustrate the design opportunities given by (a) the ILP formulation and (b) direct HWACC-to-HWACC communication, this paper analyzes the impact of job size. Results show that selecting smaller job sizes yields performance improvements and lower area usage at the cost of slightly increased synchronization overhead. A job size reduction from 1 kB to 256 bytes gives a 3.51x average performance increase across the 40 applications. Finally, a scalability analysis using a set of 5000 synthetic applications of varying size (10-125 nodes) shows that DSE with TSAR-ILP is not prohibitive, with 94.3% of applications reaching optimal solutions in under 60 seconds.
ISSN: 0278-0070, 1937-4151
DOI: 10.1109/TCAD.2023.3274050