A 1920 × 1080 25-Frames/s 2.4-TOPS/W Low-Power 6-D Vision Processor for Unified Optical Flow and Stereo Depth With Semi-Global Matching

Bibliographic Details
Published in: IEEE journal of solid-state circuits, 2019-04, Vol. 54 (4), p. 1048
Main Authors: Li, Ziyun; Wang, Jingcheng; Sylvester, Dennis; Blaauw, David; Kim, Hun Seok
Format: Article
Language: English
cited_by
cites
container_end_page
container_issue 4
container_start_page 1048
container_title IEEE journal of solid-state circuits
container_volume 54
creator Li, Ziyun
Wang, Jingcheng
Sylvester, Dennis
Blaauw, David
Kim, Hun Seok
description This paper presents a unified 6-D vision processor that enables dense real-time 3-D depth and 3-D motion perception at full-high-definition (1920 × 1080, FHD) resolution. The proposed design implements a neighbor-guided semi-global matching (NG-SGM) algorithm to unify the stereo depth and optical flow matching problem and to reduce computation by 98% compared with the original SGM. We introduce a new custom-designed, high-bandwidth coalescing crossbar circuit that automatically coalesces redundant memory accesses to mitigate the highly irregular memory accesses observed in NG-SGM. The proposed 6-D vision processor also maximizes on-chip memory reuse by using 64 on-chip rotating image buffers that cover a wide optical flow and depth disparity search range of 176 pixels per dimension. The processor implements massive parallel processing with 576 compute units that are deeply pipelined with a dependency-resolving skewed-diagonal scan to hide the dynamic and variable dependency in the pipeline. The fabricated processor performs dense NG-SGM at 25 frames/s for optical flow or 30 frames/s for stereo depth at FHD resolution while consuming only 760 mW in 28-nm CMOS.
doi_str_mv 10.1109/JSSC.2018.2885559
format article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2200823040</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2200823040</sourcerecordid><originalsourceid>FETCH-LOGICAL-p98t-ae5f47b87cbad03bdf8b8395a3bfa2e868c52e740a8283909f54ad106345b8363</originalsourceid><addsrcrecordid>eNotUF1LwzAUDaLgnP4A3y74nC5JmzZ9HNNOZdJBpxNERtqmLqNtatKxn-LfNaAP917OB_fAQeiWkoBSks6ei2IRMEJFwITgnKdnaEI5F5gm4fs5mhAv4ZQRcomunDt4GEWCTtDPHKin4SMztju2EvJOj6OqP4ESQYBxnFnZKTdzwIIIb_J1MdvCypzw2pyUhRjfw5t22vSwtqZSzhkLjZ_XXjda1ZAPo65kC1lrTiD7GopRWWXgXg3jHrbar0J1Gi9bU3rbixyrve6_rtFFI1unbv7vFG2yh83iEa_y5dNivsJDKkYsFW-ipBRJVcqahGXdiFKEKZdh2UimRCwqzlQSESmY50na8EjWlMRhxL0xDqfo7u_tYM33UblxdzBH2_vEHfNlCRaSiIS_Zutl4A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2200823040</pqid></control><display><type>article</type><title>A 1920 [Formula Omitted] 1080 25-Frames/s 2.4-TOPS/W Low-Power 6-D Vision Processor for Unified Optical Flow and Stereo Depth With Semi-Global Matching</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Li, Ziyun ; Wang, Jingcheng ; Sylvester, Dennis ; Blaauw, David ; Kim, Hun Seok</creator><creatorcontrib>Li, Ziyun ; Wang, Jingcheng ; Sylvester, Dennis ; Blaauw, David ; Kim, Hun Seok</creatorcontrib><description>This paper presents a unified 6-D vision processor that enables dense real-time 3-D depth and 3-D motion perception at full-high-definition ([Formula Omitted], FHD) resolution. The proposed design implements a neighbor-guided semi-global matching (NG-SGM) algorithm to unify the stereo depth and optical flow matching problem and to reduce computation by 98% compared with the original SGM. We introduce a new custom-designed, high-bandwidth coalescing crossbar circuit that automatically coalesces redundant memory accesses to mitigate the highly irregular memory accesses observed in NG-SGM. 
The proposed 6-D vision processor also maximizes on-chip memory reuse by using 64 on-chip rotating image buffers that cover a wide optical flow and depth disparity search range of 176 pixels per dimension. The processor implements massive parallel processing with 576 compute units that are deeply pipelined with a dependency-resolving skewed-diagonal scan to hide the dynamic and variable dependency in the pipeline. The fabricated processor performs dense NG-SGM at 25 frames/s for optical flow or 30 frames/s for stereo depth at FHD resolution while consuming only 760 mW in 28-nm CMOS.</description><identifier>ISSN: 0018-9200</identifier><identifier>EISSN: 1558-173X</identifier><identifier>DOI: 10.1109/JSSC.2018.2885559</identifier><language>eng</language><publisher>New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</publisher><subject>Algorithms ; Circuit design ; CMOS ; Coalescing ; Computer terminals ; Dependence ; Frames (data processing) ; Matching ; Microprocessors ; Motion perception ; Optical flow (image analysis) ; Parallel processing ; Pedestrians ; Three dimensional motion ; Vision</subject><ispartof>IEEE journal of solid-state circuits, 2019-04, Vol.54 (4), p.1048</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Li, Ziyun</creatorcontrib><creatorcontrib>Wang, Jingcheng</creatorcontrib><creatorcontrib>Sylvester, Dennis</creatorcontrib><creatorcontrib>Blaauw, David</creatorcontrib><creatorcontrib>Kim, Hun Seok</creatorcontrib><title>A 1920 [Formula Omitted] 1080 25-Frames/s 2.4-TOPS/W Low-Power 6-D Vision Processor for Unified Optical Flow and Stereo Depth With Semi-Global Matching</title><title>IEEE journal of solid-state circuits</title><description>This paper presents a unified 6-D vision processor that enables dense real-time 3-D depth and 3-D motion perception at full-high-definition ([Formula Omitted], FHD) resolution. The proposed design implements a neighbor-guided semi-global matching (NG-SGM) algorithm to unify the stereo depth and optical flow matching problem and to reduce computation by 98% compared with the original SGM. We introduce a new custom-designed, high-bandwidth coalescing crossbar circuit that automatically coalesces redundant memory accesses to mitigate the highly irregular memory accesses observed in NG-SGM. The proposed 6-D vision processor also maximizes on-chip memory reuse by using 64 on-chip rotating image buffers that cover a wide optical flow and depth disparity search range of 176 pixels per dimension. The processor implements massive parallel processing with 576 compute units that are deeply pipelined with a dependency-resolving skewed-diagonal scan to hide the dynamic and variable dependency in the pipeline. 
The fabricated processor performs dense NG-SGM at 25 frames/s for optical flow or 30 frames/s for stereo depth at FHD resolution while consuming only 760 mW in 28-nm CMOS.</description><subject>Algorithms</subject><subject>Circuit design</subject><subject>CMOS</subject><subject>Coalescing</subject><subject>Computer terminals</subject><subject>Dependence</subject><subject>Frames (data processing)</subject><subject>Matching</subject><subject>Microprocessors</subject><subject>Motion perception</subject><subject>Optical flow (image analysis)</subject><subject>Parallel processing</subject><subject>Pedestrians</subject><subject>Three dimensional motion</subject><subject>Vision</subject><issn>0018-9200</issn><issn>1558-173X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNotUF1LwzAUDaLgnP4A3y74nC5JmzZ9HNNOZdJBpxNERtqmLqNtatKxn-LfNaAP917OB_fAQeiWkoBSks6ei2IRMEJFwITgnKdnaEI5F5gm4fs5mhAv4ZQRcomunDt4GEWCTtDPHKin4SMztju2EvJOj6OqP4ESQYBxnFnZKTdzwIIIb_J1MdvCypzw2pyUhRjfw5t22vSwtqZSzhkLjZ_XXjda1ZAPo65kC1lrTiD7GopRWWXgXg3jHrbar0J1Gi9bU3rbixyrve6_rtFFI1unbv7vFG2yh83iEa_y5dNivsJDKkYsFW-ipBRJVcqahGXdiFKEKZdh2UimRCwqzlQSESmY50na8EjWlMRhxL0xDqfo7u_tYM33UblxdzBH2_vEHfNlCRaSiIS_Zutl4A</recordid><startdate>20190401</startdate><enddate>20190401</enddate><creator>Li, Ziyun</creator><creator>Wang, Jingcheng</creator><creator>Sylvester, Dennis</creator><creator>Blaauw, David</creator><creator>Kim, Hun Seok</creator><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope></search><sort><creationdate>20190401</creationdate><title>A 1920 [Formula Omitted] 1080 25-Frames/s 2.4-TOPS/W Low-Power 6-D Vision Processor for Unified Optical Flow and Stereo Depth With Semi-Global Matching</title><author>Li, Ziyun ; Wang, Jingcheng ; Sylvester, Dennis ; Blaauw, David ; Kim, Hun Seok</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p98t-ae5f47b87cbad03bdf8b8395a3bfa2e868c52e740a8283909f54ad106345b8363</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Algorithms</topic><topic>Circuit design</topic><topic>CMOS</topic><topic>Coalescing</topic><topic>Computer terminals</topic><topic>Dependence</topic><topic>Frames (data processing)</topic><topic>Matching</topic><topic>Microprocessors</topic><topic>Motion perception</topic><topic>Optical flow (image analysis)</topic><topic>Parallel processing</topic><topic>Pedestrians</topic><topic>Three dimensional motion</topic><topic>Vision</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, Ziyun</creatorcontrib><creatorcontrib>Wang, Jingcheng</creatorcontrib><creatorcontrib>Sylvester, Dennis</creatorcontrib><creatorcontrib>Blaauw, David</creatorcontrib><creatorcontrib>Kim, Hun Seok</creatorcontrib><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE journal of solid-state circuits</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Li, Ziyun</au><au>Wang, Jingcheng</au><au>Sylvester, Dennis</au><au>Blaauw, David</au><au>Kim, Hun Seok</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A 1920 [Formula Omitted] 1080 25-Frames/s 
2.4-TOPS/W Low-Power 6-D Vision Processor for Unified Optical Flow and Stereo Depth With Semi-Global Matching</atitle><jtitle>IEEE journal of solid-state circuits</jtitle><date>2019-04-01</date><risdate>2019</risdate><volume>54</volume><issue>4</issue><spage>1048</spage><pages>1048-</pages><issn>0018-9200</issn><eissn>1558-173X</eissn><abstract>This paper presents a unified 6-D vision processor that enables dense real-time 3-D depth and 3-D motion perception at full-high-definition ([Formula Omitted], FHD) resolution. The proposed design implements a neighbor-guided semi-global matching (NG-SGM) algorithm to unify the stereo depth and optical flow matching problem and to reduce computation by 98% compared with the original SGM. We introduce a new custom-designed, high-bandwidth coalescing crossbar circuit that automatically coalesces redundant memory accesses to mitigate the highly irregular memory accesses observed in NG-SGM. The proposed 6-D vision processor also maximizes on-chip memory reuse by using 64 on-chip rotating image buffers that cover a wide optical flow and depth disparity search range of 176 pixels per dimension. The processor implements massive parallel processing with 576 compute units that are deeply pipelined with a dependency-resolving skewed-diagonal scan to hide the dynamic and variable dependency in the pipeline. The fabricated processor performs dense NG-SGM at 25 frames/s for optical flow or 30 frames/s for stereo depth at FHD resolution while consuming only 760 mW in 28-nm CMOS.</abstract><cop>New York</cop><pub>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</pub><doi>10.1109/JSSC.2018.2885559</doi></addata></record>
fulltext fulltext
identifier ISSN: 0018-9200
ispartof IEEE journal of solid-state circuits, 2019-04, Vol.54 (4), p.1048
issn 0018-9200
1558-173X
language eng
recordid cdi_proquest_journals_2200823040
source IEEE Electronic Library (IEL) Journals
subjects Algorithms
Circuit design
CMOS
Coalescing
Computer terminals
Dependence
Frames (data processing)
Matching
Microprocessors
Motion perception
Optical flow (image analysis)
Parallel processing
Pedestrians
Three dimensional motion
Vision
title A 1920 × 1080 25-Frames/s 2.4-TOPS/W Low-Power 6-D Vision Processor for Unified Optical Flow and Stereo Depth With Semi-Global Matching
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T03%3A04%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%201920%20%5BFormula%20Omitted%5D%201080%2025-Frames/s%202.4-TOPS/W%20Low-Power%206-D%20Vision%20Processor%20for%20Unified%20Optical%20Flow%20and%20Stereo%20Depth%20With%20Semi-Global%20Matching&rft.jtitle=IEEE%20journal%20of%20solid-state%20circuits&rft.au=Li,%20Ziyun&rft.date=2019-04-01&rft.volume=54&rft.issue=4&rft.spage=1048&rft.pages=1048-&rft.issn=0018-9200&rft.eissn=1558-173X&rft_id=info:doi/10.1109/JSSC.2018.2885559&rft_dat=%3Cproquest%3E2200823040%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-p98t-ae5f47b87cbad03bdf8b8395a3bfa2e868c52e740a8283909f54ad106345b8363%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2200823040&rft_id=info:pmid/&rfr_iscdi=true