Automatic Generation and Optimization Framework of NoC-Based Neural Network Accelerator Through Reinforcement Learning
Choices of dataflows, which are known as intra-core neural network (NN) computation loop nest scheduling and inter-core hardware mapping strategies, play a critical role in the performance and energy efficiency of NoC-based neural network accelerators. Confronted with an enormous dataflow exploration space, this paper proposes an automatic framework for generating and optimizing the full-layer mappings based on two reinforcement learning algorithms including A2C and PPO.
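The abstract frames inter-core hardware mapping as a sequential decision problem: an agent places one NN layer at a time onto a NoC core and is rewarded for low time and energy. The following is a toy sketch of that framing only, not the paper's actual environment or reward; the core count, layer workloads, and cost model are all hypothetical.

```python
# Toy sketch (not the paper's implementation): mapping NN layers onto NoC
# cores as a sequential decision problem. At each step the policy picks a
# core for the next layer; the episode reward penalizes a made-up
# time-energy estimate. All constants are hypothetical.

N_CORES = 4                     # hypothetical 2x2 NoC mesh
LAYER_LOADS = [8, 4, 6, 2, 5]   # hypothetical per-layer workloads

def episode(policy):
    """Map each layer to a core in sequence; return (mapping, reward)."""
    mapping, core_load = [], [0] * N_CORES
    for load in LAYER_LOADS:
        core = policy(core_load)        # action: choose a core
        mapping.append(core)
        core_load[core] += load         # state update
    exec_time = max(core_load)          # toy latency: busiest core dominates
    energy = sum(core_load)             # toy energy: total work performed
    return mapping, -(exec_time * energy)  # reward: negative time-energy product

def greedy(core_load):
    """Baseline policy: always pick the least-loaded core."""
    return core_load.index(min(core_load))

mapping, reward = episode(greedy)
```

An RL agent such as A2C or PPO would replace the hand-written `greedy` policy with a learned one and would additionally have to respect the soft and hard mapping constraints the paper combines, which this sketch omits.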
Published in: | IEEE transactions on computers 2024-12, Vol.73 (12), p.2882-2896 |
---|---|
Main Authors: | Xue, Yongqi; Ji, Jinlun; Yu, Xinming; Zhou, Shize; Li, Siyue; Li, Xinyi; Cheng, Tong; Li, Shiping; Chen, Kai; Lu, Zhonghai; Li, Li; Fu, Yuxiang |
Format: | Article |
Language: | English |
Subjects: | Artificial neural networks; Biological neural networks; Computer architecture; Hardware; hardware mapping; Network-on-chip; Neural networks; Reinforcement learning; Task analysis |
---|---|
container_end_page | 2896 |
container_issue | 12 |
container_start_page | 2882 |
container_title | IEEE transactions on computers |
container_volume | 73 |
creator | Xue, Yongqi Ji, Jinlun Yu, Xinming Zhou, Shize Li, Siyue Li, Xinyi Cheng, Tong Li, Shiping Chen, Kai Lu, Zhonghai Li, Li Fu, Yuxiang |
description | Choices of dataflows, which are known as intra-core neural network (NN) computation loop nest scheduling and inter-core hardware mapping strategies, play a critical role in the performance and energy efficiency of NoC-based neural network accelerators. Confronted with an enormous dataflow exploration space, this paper proposes an automatic framework for generating and optimizing the full-layer mappings based on two reinforcement learning algorithms including A2C and PPO. Combining soft and hard constraints, this work transforms the mapping configuration into a sequential decision problem and aims to explore performance- and energy-efficient hardware mappings for NoC systems. We evaluate the performance of the proposed framework on 10 experimental neural networks. The results show that compared with the direct-X mapping, the direct-Y mapping, GA-based mapping, and NN-aware mapping, our optimization framework reduces the average execution time of the 10 experimental NNs by 9.09%, improves the throughput by 11.27%, reduces the energy by 12.62%, and reduces the time-energy product (TEP) by 14.49%. The results also show that the performance enhancement is related to the coefficient of variation of the neural network to be computed. |
doi_str_mv | 10.1109/TC.2024.3441822 |
format | article |
identifier | ISSN: 0018-9340 |
ispartof | IEEE transactions on computers, 2024-12, Vol.73 (12), p.2882-2896 |
issn | 0018-9340; 1557-9956 (EISSN) |
language | eng |
source | IEEE Electronic Library (IEL) Journals |
subjects | Artificial neural networks; Biological neural networks; Computer architecture; Hardware; hardware mapping; Network-on-chip; Neural networks; Reinforcement learning; Task analysis |
title | Automatic Generation and Optimization Framework of NoC-Based Neural Network Accelerator Through Reinforcement Learning |
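The description field reports results in terms of two simple quantities: the time-energy product (TEP, execution time multiplied by energy) and the coefficient of variation (CV, standard deviation divided by mean) of the network being computed. A minimal sketch of both, with hypothetical workload numbers:

```python
# Hedged sketch of the two metrics named in the abstract. TEP is execution
# time times energy (lower is better); CV measures how uneven a network's
# per-layer workload is. The example values are hypothetical.

from statistics import mean, pstdev

def tep(exec_time, energy):
    """Time-energy product: exec_time * energy."""
    return exec_time * energy

def coefficient_of_variation(layer_workloads):
    """Dispersion of per-layer work: population std / mean."""
    return pstdev(layer_workloads) / mean(layer_workloads)

uniform_net = [4, 4, 4, 4]   # identical layers -> CV = 0
skewed_net = [2, 4, 6]       # uneven layers -> CV > 0
```

A network with CV = 0 has identical layers and leaves little room for a smarter mapping to balance load, which is consistent with the abstract's observation that the performance enhancement is related to the CV of the network.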