
Automatic Generation and Optimization Framework of NoC-Based Neural Network Accelerator Through Reinforcement Learning

Bibliographic Details
Published in: IEEE Transactions on Computers, 2024-12, Vol. 73 (12), pp. 2882-2896
Main Authors: Xue, Yongqi, Ji, Jinlun, Yu, Xinming, Zhou, Shize, Li, Siyue, Li, Xinyi, Cheng, Tong, Li, Shiping, Chen, Kai, Lu, Zhonghai, Li, Li, Fu, Yuxiang
Format: Article
Language:English
Description: Choices of dataflows, which are known as intra-core neural network (NN) computation loop nest scheduling and inter-core hardware mapping strategies, play a critical role in the performance and energy efficiency of NoC-based neural network accelerators. Confronted with an enormous dataflow exploration space, this paper proposes an automatic framework for generating and optimizing full-layer mappings based on two reinforcement learning algorithms, A2C and PPO. Combining soft and hard constraints, this work transforms the mapping configuration into a sequential decision problem and aims to explore performance- and energy-efficient hardware mappings for NoC systems. We evaluate the performance of the proposed framework on 10 experimental neural networks. The results show that, compared with direct-X mapping, direct-Y mapping, GA-based mapping, and NN-aware mapping, our optimization framework reduces the average execution time of the 10 experimental NNs by 9.09%, improves throughput by 11.27%, reduces energy by 12.62%, and reduces the time-energy product (TEP) by 14.49%. The results also show that the performance enhancement is related to the coefficient of variation of the neural network to be computed.
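The two derived quantities named in the abstract can be sketched in a few lines. The definitions below are assumptions inferred from the metric names, not formulas taken from the paper: TEP is taken as the plain product of execution time and energy (by analogy with the energy-delay product), and the coefficient of variation is computed over hypothetical per-layer workload sizes, since the paper's abstract does not say which per-layer quantity its NN-level CV uses.

```python
import statistics

def time_energy_product(exec_time_s: float, energy_j: float) -> float:
    # Assumed form: TEP = execution time * energy, by analogy with the
    # energy-delay product; the paper may define or normalize it differently.
    return exec_time_s * energy_j

def coefficient_of_variation(values: list[float]) -> float:
    # CV = population standard deviation / mean.
    return statistics.pstdev(values) / statistics.fmean(values)

# Hypothetical per-layer MAC counts for a small network, used only to
# illustrate how uneven per-layer workloads yield a larger CV.
layer_macs = [1.2e8, 9.0e7, 4.5e7, 2.0e7]
cv = coefficient_of_variation(layer_macs)
tep = time_energy_product(0.012, 0.35)  # seconds * joules
```

A lower TEP rewards mappings that are simultaneously fast and frugal, which is why it is reported alongside the separate time and energy reductions.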
DOI: 10.1109/TC.2024.3441822
Publisher: IEEE
ISSN: 0018-9340
EISSN: 1557-9956
Source: IEEE Electronic Library (IEL) Journals
Subjects: Artificial neural networks; Biological neural networks; Computer architecture; Hardware; hardware mapping; Network-on-chip; Neural networks; Reinforcement learning; Task analysis