Loading…

Reinforcement Learning-Driven Bit-Width Optimization for the High-Level Synthesis of Transformer Designs on Field-Programmable Gate Arrays

With the rapid development of deep-learning models, especially the widespread adoption of transformer architectures, the demand for efficient hardware accelerators with field-programmable gate arrays (FPGAs) has increased owing to their flexibility and performance advantages. Although high-level syn...

Full description

Saved in:

Bibliographic Details
Published in:	Electronics (Basel) 2024-02, Vol.13 (3), p.552
Main Authors:	Jang, Seojin, Cho, Yongbeom
Format:	Article
Language:	English
Subjects:	Accelerators Accuracy Algorithms Circuit design Deep learning Design Digital integrated circuits Efficiency Fashion models Field programmable gate arrays Hardware High level synthesis Language Machine learning Methods Natural language processing Neural networks Optimization Optimization techniques Power Resource utilization
Citations:	Items that this one cites
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	With the rapid development of deep-learning models, especially the widespread adoption of transformer architectures, the demand for efficient hardware accelerators with field-programmable gate arrays (FPGAs) has increased owing to their flexibility and performance advantages. Although high-level synthesis can shorten the hardware design cycle, determining the optimal bit-width for various transformer designs remains challenging. Therefore, this paper proposes a novel technique based on a predesigned transformer hardware architecture tailored for various types of FPGAs. The proposed method leverages a reinforcement learning-driven mechanism to automatically adapt and optimize bit-width settings based on user-provided transformer variants during inference on an FPGA, significantly alleviating the challenges related to bit-width optimization. The effect of bit-width settings on resource utilization and performance across different FPGA types was analyzed. The efficacy of the proposed method was demonstrated by optimizing the bit-width settings for users’ transformer-based model inferences on an FPGA. The use of the predesigned hardware architecture significantly enhanced the performance. Overall, the proposed method enables effective and optimized implementations of user-provided transformer-based models on an FPGA, paving the way for edge FPGA-based deep-learning accelerators while reducing the time and effort typically required in fine-tuning bit-width settings.
ISSN:	2079-9292 2079-9292
DOI:	10.3390/electronics13030552