A 1920 × 1080 25-Frames/s 2.4-TOPS/W Low-Power 6-D Vision Processor for Unified Optical Flow and Stereo Depth With Semi-Global Matching
This paper presents a unified 6-D vision processor that enables dense real-time 3-D depth and 3-D motion perception at full-high-definition (1920 × 1080, FHD) resolution. The proposed design implements a neighbor-guided semi-global matching (NG-SGM) algorithm to unify the stereo depth and opti...
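For readers unfamiliar with semi-global matching, the following is a minimal sketch of the cost-aggregation recurrence used by the original SGM along a single left-to-right path. It is not the NG-SGM variant described in this record, whose details are not given here. The function name `aggregate_path` and the penalty values `p1` and `p2` are illustrative assumptions.

```python
def aggregate_path(cost, p1=1.0, p2=3.0):
    """Aggregate matching costs along one scanline, left to right.

    cost[x][d] is the matching cost of pixel x at disparity d.
    p1 penalises a disparity change of one level between neighbours,
    p2 penalises any larger change. Both values are assumptions.
    """
    ndisp = len(cost[0])
    # The first pixel has no predecessor, so its cost is used unchanged.
    agg = [[float(c) for c in cost[0]]]
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(ndisp):
            # Same disparity as the previous pixel, or a large jump.
            candidates = [prev[d], prev_min + p2]
            # Small jumps to the adjacent disparity levels.
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d + 1 < ndisp:
                candidates.append(prev[d + 1] + p1)
            # Subtracting prev_min keeps the values from growing along the path.
            row.append(cost[x][d] + min(candidates) - prev_min)
        agg.append(row)
    return agg


if __name__ == "__main__":
    toy_cost = [[0, 5, 5], [5, 0, 5]]
    aggregated = aggregate_path(toy_cost)
    # Winner-takes-all disparity per pixel from this single path.
    print([row.index(min(row)) for row in aggregated])
```

A full SGM implementation sums the aggregated costs from several path directions before picking the minimum-cost disparity; this sketch keeps one direction to show the recurrence only.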
Published in: | IEEE journal of solid-state circuits 2019-04, Vol.54 (4), p.1048 |
---|---|
Main Authors: | Li, Ziyun; Wang, Jingcheng; Sylvester, Dennis; Blaauw, David; Kim, Hun Seok |
Format: | Article |
Language: | English |
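Subjects: | Algorithms; Circuit design; CMOS; Coalescing; Computer terminals; Dependence; Frames (data processing); Matching; Microprocessors; Motion perception; Optical flow (image analysis); Parallel processing; Pedestrians; Three dimensional motion; Vision |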
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | 4 |
container_start_page | 1048 |
container_title | IEEE journal of solid-state circuits |
container_volume | 54 |
creator | Li, Ziyun; Wang, Jingcheng; Sylvester, Dennis; Blaauw, David; Kim, Hun Seok |
description | This paper presents a unified 6-D vision processor that enables dense real-time 3-D depth and 3-D motion perception at full-high-definition (1920 × 1080, FHD) resolution. The proposed design implements a neighbor-guided semi-global matching (NG-SGM) algorithm to unify the stereo depth and optical flow matching problem and to reduce computation by 98% compared with the original SGM. We introduce a new custom-designed, high-bandwidth coalescing crossbar circuit that automatically coalesces redundant memory accesses to mitigate the highly irregular memory accesses observed in NG-SGM. The proposed 6-D vision processor also maximizes on-chip memory reuse by using 64 on-chip rotating image buffers that cover a wide optical flow and depth disparity search range of 176 pixels per dimension. The processor implements massive parallel processing with 576 compute units that are deeply pipelined with a dependency-resolving skewed-diagonal scan to hide the dynamic and variable dependency in the pipeline. The fabricated processor performs dense NG-SGM at 25 frames/s for optical flow or 30 frames/s for stereo depth at FHD resolution while consuming only 760 mW in 28-nm CMOS. |
doi_str_mv | 10.1109/JSSC.2018.2885559 |
format | article |
publisher | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019 |
date | 2019-04-01 |
fulltext | fulltext |
identifier | ISSN: 0018-9200 |
ispartof | IEEE journal of solid-state circuits, 2019-04, Vol.54 (4), p.1048 |
issn | 0018-9200 (print); 1558-173X (electronic) |
language | eng |
recordid | cdi_proquest_journals_2200823040 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Algorithms; Circuit design; CMOS; Coalescing; Computer terminals; Dependence; Frames (data processing); Matching; Microprocessors; Motion perception; Optical flow (image analysis); Parallel processing; Pedestrians; Three dimensional motion; Vision |
title | A 1920 × 1080 25-Frames/s 2.4-TOPS/W Low-Power 6-D Vision Processor for Unified Optical Flow and Stereo Depth With Semi-Global Matching |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T03%3A04%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%201920%20%5BFormula%20Omitted%5D%201080%2025-Frames/s%202.4-TOPS/W%20Low-Power%206-D%20Vision%20Processor%20for%20Unified%20Optical%20Flow%20and%20Stereo%20Depth%20With%20Semi-Global%20Matching&rft.jtitle=IEEE%20journal%20of%20solid-state%20circuits&rft.au=Li,%20Ziyun&rft.date=2019-04-01&rft.volume=54&rft.issue=4&rft.spage=1048&rft.pages=1048-&rft.issn=0018-9200&rft.eissn=1558-173X&rft_id=info:doi/10.1109/JSSC.2018.2885559&rft_dat=%3Cproquest%3E2200823040%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-p98t-ae5f47b87cbad03bdf8b8395a3bfa2e868c52e740a8283909f54ad106345b8363%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2200823040&rft_id=info:pmid/&rfr_iscdi=true |
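The headline figures in this record can be related with some simple arithmetic. The sketch below derives pixel throughput and energy per pixel from the frame rates, the FHD resolution and the 760 mW power figure. It assumes the 760 mW figure applies to both operating modes and that the 2.4 TOPS/W efficiency in the title is quoted at that same operating point; the record does not state either. All function and constant names are illustrative.

```python
WIDTH, HEIGHT = 1920, 1080   # FHD resolution
POWER_W = 0.760              # reported power; assumed to hold for both modes
TOPS_PER_W = 2.4             # efficiency quoted in the title


def pixels_per_second(fps):
    """Dense per-pixel outputs produced each second at the given frame rate."""
    return WIDTH * HEIGHT * fps


def energy_per_pixel_nj(fps):
    """Energy spent per output pixel in nanojoules, under the power assumption above."""
    return POWER_W / pixels_per_second(fps) * 1e9


def implied_tops():
    """Throughput implied by the efficiency figure, if quoted at POWER_W (an assumption)."""
    return TOPS_PER_W * POWER_W


if __name__ == "__main__":
    for mode, fps in (("optical flow", 25), ("stereo depth", 30)):
        print(f"{mode}: {pixels_per_second(fps):,} pixels/s, "
              f"{energy_per_pixel_nj(fps):.2f} nJ/pixel")
    print(f"implied throughput: {implied_tops():.3f} TOPS")
```

These numbers are back-of-the-envelope estimates derived from the record, not measurements reported by the authors.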