Pico-Programmable Neurons to Reduce Computations for Deep Neural Network Accelerators

Bibliographic Details
Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2024-07, Vol. 32 (7), p. 1216-1227
Main Authors: Nahvy, Alireza; Navabi, Zainalabedin
Format: Article
Language:English
Abstract: Deep neural networks (DNNs) have shown impressive success in various fields. In response to the ever-growing precision demands of DNN applications, increasingly complex computational models are being created. The growing computational volume has become a challenge for the power and performance efficiency of DNN accelerators. This article presents a new neural architecture that prevents ineffective and redundant computations by using neurons with memory that have decision-making power. In addition, a second local memory keeps a calculation history so that redundancy can be removed through computational reuse. Sparse computing is also supported, removing the computations not only of zero weights but also of the zero bits within each weight. Results on conventional datasets such as ImageNet show a computational reduction of 18× to 150×. This scalable architecture achieves 124 GOPS while consuming 197 mW.
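The two computation-reduction ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' hardware design: it is a plain-Python analogy, assuming nonnegative integer (quantized) activations and weights, in which each weight contributes one shift-add per set bit (so zero weights and zero bits cost nothing), and a memo dictionary stands in for the local history memory used for computational reuse.

```python
def bitserial_mac(activations, weights, memo=None):
    """Dot product computed as shift-adds over the set bits of each weight.

    Zero weights are skipped entirely, and for nonzero weights only the
    '1' bits trigger work, mirroring the bit-level sparsity described in
    the abstract. The memo dict (activation, weight) -> product replays
    previously computed products instead of recomputing them.
    """
    if memo is None:
        memo = {}
    acc = 0
    for a, w in zip(activations, weights):
        if w == 0:          # whole-weight sparsity: no computation at all
            continue
        key = (a, w)
        if key in memo:     # computational reuse: replay a stored product
            acc += memo[key]
            continue
        prod, bit = 0, w
        shift = 0
        while bit:
            if bit & 1:     # only set bits of the weight cost a shift-add
                prod += a << shift
            bit >>= 1
            shift += 1
        memo[key] = prod
        acc += prod
    return acc
```

For example, `bitserial_mac([3, 5, 7], [0, 2, 4])` performs no work for the zero weight and a single shift-add for each of the one-bit weights 2 and 4, returning 38. A real accelerator would implement the decision-making and history lookup in the neuron datapath rather than in software.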
DOI: 10.1109/TVLSI.2024.3386698
ISSN: 1063-8210
EISSN: 1557-9999
Source: IEEE Electronic Library (IEL) Journals
Subjects:
Accelerators
Arithmetic
Artificial neural networks
Computational efficiency
Computational reuse
Computer architecture
Costs
Deep neural networks (DNN)
Mathematical models
Microprogrammed architecture
Multiplication and accumulation (MAC) operation
Neurons
Redundancy