ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars
Main Authors: | Shafiee, Ali; Nag, Anirban; Muralimanohar, Naveen; Balasubramonian, Rajeev; Strachan, John Paul; Miao Hu; Williams, R. Stanley; Srikumar, Vivek |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | accelerator; analog; Biological neural networks; CNN; Computer architecture; DNN; Kernel; Machine learning algorithms; memristor; Memristors; neural; Neurons; Pipelines |
container_end_page | 26 |
container_start_page | 14 |
creator | Shafiee, Ali; Nag, Anirban; Muralimanohar, Naveen; Balasubramonian, Rajeev; Strachan, John Paul; Miao Hu; Williams, R. Stanley; Srikumar, Vivek |
description | A number of recent efforts have attempted to design accelerators for popular machine learning algorithms, such as those involving convolutional and deep neural networks (CNNs and DNNs). These algorithms typically involve a large number of multiply-accumulate (dot-product) operations. A recent project, DaDianNao, adopts a near data processing approach, where a specialized neural functional unit performs all the digital arithmetic operations and receives input weights from adjacent eDRAM banks. This work explores an in-situ processing approach, where memristor crossbar arrays not only store input weights, but are also used to perform dot-product operations in an analog manner. While the use of crossbar memory as an analog dot-product engine is well known, no prior work has designed or characterized a full-fledged accelerator based on crossbars. In particular, our work makes the following contributions: (i) We design a pipelined architecture, with some crossbars dedicated for each neural network layer, and eDRAM buffers that aggregate data between pipeline stages. (ii) We define new data encoding techniques that are amenable to analog computations and that can reduce the high overheads of analog-to-digital conversion (ADC). (iii) We define the many supporting digital components required in an analog CNN accelerator and carry out a design space exploration to identify the best balance of memristor storage/compute, ADCs, and eDRAM storage on a chip. On a suite of CNN and DNN workloads, the proposed ISAAC architecture yields improvements of 14.8×, 5.5×, and 7.5× in throughput, energy, and computational density (respectively), relative to the state-of-the-art DaDianNao architecture. |
doi_str_mv | 10.1109/ISCA.2016.12 |
format | conference_proceeding |
identifier | ISSN: 1063-6897 |
ispartof | 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), 2016, p.14-26 |
issn | 1063-6897; 2575-713X |
language | eng |
source | IEEE Xplore All Conference Series |
subjects | accelerator; analog; Biological neural networks; CNN; Computer architecture; DNN; Kernel; Machine learning algorithms; memristor; Memristors; neural; Neurons; Pipelines |
title | ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars |
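The description above summarizes ISAAC's central idea: a memristor crossbar stores the network's weights as cell conductances and computes dot products in the analog domain, with ADCs digitizing the results. A minimal sketch of that behavior follows. It illustrates the general crossbar dot-product technique, not the paper's implementation; the array size and ADC resolution are assumptions chosen for the example.

```python
import numpy as np

# Minimal sketch (not the paper's design): a memristor crossbar computes
# dot products in the analog domain. Each cell stores a weight as a
# conductance G[i][j]; applying input voltages V[i] to the rows makes each
# column j accumulate a current I[j] = sum_i V[i] * G[i][j]
# (Ohm's law per cell, Kirchhoff's current law per column).

rng = np.random.default_rng(0)

rows, cols = 128, 128                     # illustrative crossbar tile size (assumption)
G = rng.uniform(0.0, 1.0, (rows, cols))   # conductances encoding the weights
V = rng.uniform(0.0, 1.0, rows)           # input voltages, one per row

# The analog "computation": every column sums its cell currents at once,
# so all rows*cols multiply-accumulates complete in a single step.
I = V @ G                                 # shape (cols,): one dot product per column

# An ADC then digitizes each column current; ADC resolution is a major
# overhead, which is what the paper's data-encoding schemes target.
adc_bits = 8                              # illustrative resolution (assumption)
levels = 2 ** adc_bits - 1
I_digital = np.round(I / I.max() * levels)

print(I[:4])
print(I_digital[:4])
```

Because every multiply-accumulate in the array happens concurrently when the voltages are applied, the per-operation cost shifts from digital arithmetic to the column ADCs, which is why the abstract emphasizes encoding techniques that reduce analog-to-digital conversion overhead.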