TileNET: Scalable Architecture for High-Throughput Ternary Convolution Neural Networks Using FPGAs
Convolutional neural networks (CNNs) are becoming increasingly popular in advanced driver assistance systems (ADAS) and automated driving (AD) for camera perception, enabling applications such as object detection, lane detection and semantic segmentation. The ever-increasing need for high resolution...
Main Authors: | Vikram, Sahu Sai; Pant, Vibha; Mody, Mihir; Purnaprajna, Madhura |
---|---|
Format: | Conference Proceeding |
Language: | English |
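Subjects: | Acceleration; Computer architecture; Convolution; Electronic mail; Field programmable gate arrays; Neural networks; Throughput |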
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 462 |
container_issue | |
container_start_page | 461 |
container_title | |
container_volume | |
creator | Vikram, Sahu Sai; Pant, Vibha; Mody, Mihir; Purnaprajna, Madhura |
description | Convolutional neural networks (CNNs) are becoming increasingly popular in advanced driver assistance systems (ADAS) and automated driving (AD) for camera perception, enabling applications such as object detection, lane detection and semantic segmentation. The ever-increasing need for multiple high-resolution cameras around the car necessitates a throughput on the order of a few tens of tera multiply-accumulate operations per second (TMACS), along with high detection accuracy. Existing implementations do not scale, with performance only on the order of a few giga operations per second. This paper proposes a novel tiled architecture for CNNs that uses only ternarized weights, while input and output features are kept at full precision, resulting in minimal loss of accuracy. The proposed solution is implemented on a Virtex-7 FPGA, resulting in a throughput of 13.76 TOPS. In post-implementation power simulation, AlexNet consumes 16 W, orders of magnitude lower than existing GPUs. |
doi_str_mv | 10.1109/VLSID.2018.113 |
format | conference_proceeding |
eisbn | 1538636921; 9781538636923 |
coden | IEEPAD |
publisher | IEEE |
creationdate | 2018-01 |
tpages | 2 |
linktohtml | https://ieeexplore.ieee.org/document/8326976 |
fulltext | fulltext_linktorsrc |
identifier | EISSN: 2380-6923 |
ispartof | 2018 31st International Conference on VLSI Design and 2018 17th International Conference on Embedded Systems (VLSID), 2018, p.461-462 |
issn | 2380-6923 |
language | eng |
recordid | cdi_ieee_primary_8326976 |
source | IEEE Xplore All Conference Series |
subjects | Acceleration; Computer architecture; Convolution; Electronic mail; Field programmable gate arrays; Neural networks; Throughput |
title | TileNET: Scalable Architecture for High-Throughput Ternary Convolution Neural Networks Using FPGAs |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T05%3A50%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=TileNET:%20Scalable%20Architecture%20for%20High-Throughput%20Ternary%20Convolution%20Neural%20Networks%20Using%20FPGAs&rft.btitle=2018%2031st%20International%20Conference%20on%20VLSI%20Design%20and%202018%2017th%20International%20Conference%20on%20Embedded%20Systems%20(VLSID)&rft.au=Vikram,%20Sahu%20Sai&rft.date=2018-01&rft.spage=461&rft.epage=462&rft.pages=461-462&rft.eissn=2380-6923&rft.coden=IEEPAD&rft_id=info:doi/10.1109/VLSID.2018.113&rft.eisbn=1538636921&rft.eisbn_list=9781538636923&rft_dat=%3Cieee_CHZPO%3E8326976%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i175t-60c5f6aef8d2b7e8540bb7ec4245f91cb78f4cc3d14889f8d913a8ef1b47f4453%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=8326976&rfr_iscdi=true |