Automatic Generation and Optimization Framework of NoC-Based Neural Network Accelerator Through Reinforcement Learning

Bibliographic Details
Published in: IEEE Transactions on Computers, 2024-12, Vol. 73 (12), pp. 2882-2896
Main Authors: Xue, Yongqi, Ji, Jinlun, Yu, Xinming, Zhou, Shize, Li, Siyue, Li, Xinyi, Cheng, Tong, Li, Shiping, Chen, Kai, Lu, Zhonghai, Li, Li, Fu, Yuxiang
Format: Article
Language:English
Summary: Choices of dataflows, which are known as intra-core neural network (NN) computation loop-nest scheduling and inter-core hardware mapping strategies, play a critical role in the performance and energy efficiency of NoC-based neural network accelerators. Confronted with an enormous dataflow exploration space, this paper proposes an automatic framework for generating and optimizing full-layer mappings based on two reinforcement learning algorithms, A2C and PPO. Combining soft and hard constraints, this work transforms the mapping configuration into a sequential decision problem and aims to explore performance- and energy-efficient hardware mappings for NoC systems. We evaluate the performance of the proposed framework on 10 experimental neural networks. The results show that, compared with the direct-X mapping, the direct-Y mapping, the GA-based mapping, and the NN-aware mapping, our optimization framework reduces the average execution time of the 10 experimental NNs by 9.09%, improves the throughput by 11.27%, reduces the energy by 12.62%, and reduces the time-energy product (TEP) by 14.49%. The results also show that the performance enhancement is related to the coefficient of variation of the neural network to be computed.
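The time-energy product (TEP) reported in the summary is the product of a mapping's execution time and energy consumption, so it rewards mappings that improve both at once. A minimal sketch of the metric and of the relative-reduction arithmetic behind the quoted percentages; all numeric values here are illustrative placeholders, not data from the paper:

```python
# Hypothetical illustration of the TEP metric; the helper names and the
# baseline/optimized numbers are invented for this sketch.

def tep(exec_time_s: float, energy_j: float) -> float:
    """Time-energy product: lower is better."""
    return exec_time_s * energy_j

def reduction_pct(baseline: float, optimized: float) -> float:
    """Relative reduction of a lower-is-better metric, in percent."""
    return 100.0 * (baseline - optimized) / baseline

# Illustrative values for one NN layer mapping (not from the paper):
baseline  = tep(1.2e-3, 2.5)   # 1.2 ms, 2.5 J  -> 3.0e-3
optimized = tep(1.0e-3, 2.2)   # 1.0 ms, 2.2 J  -> 2.2e-3
print(round(reduction_pct(baseline, optimized), 2))  # → 26.67
```

Note that a TEP reduction is not simply the sum of the time and energy reductions; because TEP is a product, the combined effect must be computed per network and then averaged, which is why the reported averages (9.09% time, 12.62% energy, 14.49% TEP) need not compose multiplicatively.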
ISSN: 0018-9340, 1557-9956
DOI: 10.1109/TC.2024.3441822