
A 4.5Tb/s 3.4Tb/s/W 64×64 switch fabric with self-updating least-recently-granted priority and quality-of-service arbitration in 45nm CMOS

High-speed and low-power routers form the basic building blocks of on-die interconnect fabrics that are critical to overall throughput and energy efficiency of high performance systems. Conventional routers use distinct logic blocks for routing data and handling arbitration. At higher radices, conne...

Bibliographic Details
Main Authors: Satpathy, S., Sewell, K., Manville, T., Yen-Po Chen, Dreslinski, R., Sylvester, D., Mudge, T., Blaauw, D.
Format: Conference Proceeding
Language: English
Subjects: Delay; Fabrics; Quality of service; Repeaters; Routing; Switches; Very large scale integration
Online Access: Request full text
cited_by
cites
container_end_page 480
container_issue
container_start_page 478
container_title
container_volume
creator Satpathy, S.
Sewell, K.
Manville, T.
Yen-Po Chen
Dreslinski, R.
Sylvester, D.
Mudge, T.
Blaauw, D.
description High-speed, low-power routers form the basic building blocks of on-die interconnect fabrics that are critical to the overall throughput and energy efficiency of high-performance systems. Conventional routers use distinct logic blocks for routing data and handling arbitration. At higher radices, connections between these blocks become a bottleneck, limiting router scalability and degrading performance. Recently, two switch topologies merged the data routing fabric with arbitration control, avoiding this bottleneck. However, one relies on centralized control for channel allocation, limiting performance, while the other is restricted to a small set of fixed priorities, rendering input ports prone to starvation. In addition, ever-larger chip multiprocessors (CMPs) will require continued increases in bandwidth over previous designs. To address these issues, we present a 64×64 single-stage swizzle-switch network (SSN) with 128b data buses (8192 total input/output wires). The SSN can connect any input to any output, including multicast. It has a peak measured throughput of 4.5Tb/s at 1.1V in 45nm SOI CMOS at 25°C. The SSN's key features are: 1) a single-cycle least-recently-granted (LRG) priority arbitration technique that reuses the already present input and output data buses and their drivers and sense amps; 2) an additional 4-level message-based priority arbitration for quality of service (QoS) with 2% logic and 3% wiring overhead; 3) a bidirectional bitline repeater that allows the router to scale to >8000 wires. These features result in a compact fabric (4.06mm²) with a throughput gain of 2.1× over prior work at 3.4Tb/s/W efficiency, which improves to 7.4Tb/s/W at 600mV.
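The arbitration policy named in the description can be illustrated with a small behavioral model. This is only a software sketch of least-recently-granted ordering combined with four QoS levels for a single output port; it is not the circuit from the paper, which performs arbitration on the data bitlines themselves. The class name `LrgArbiter`, the `arbitrate` method, and the rule that a higher QoS level always beats LRG order are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LrgArbiter:
    """Behavioral model of one output port's arbiter (hypothetical names)."""

    num_inputs: int
    qos_levels: int = 4  # the description mentions 4 message priority levels
    # order[0] is the least recently granted input, i.e. highest LRG priority.
    order: list = field(default_factory=list)

    def __post_init__(self):
        if not self.order:
            self.order = list(range(self.num_inputs))

    def arbitrate(self, requests):
        """Pick a winner among requesting inputs.

        requests maps input port -> QoS level (0 .. qos_levels - 1).
        Assumption: the highest QoS level present wins; ties are broken
        by least-recently-granted order. Returns the winning port or None.
        """
        if not requests:
            return None
        for port, level in requests.items():
            if not 0 <= port < self.num_inputs:
                raise ValueError(f"unknown input port {port}")
            if not 0 <= level < self.qos_levels:
                raise ValueError(f"invalid QoS level {level}")
        top = max(requests.values())
        for port in self.order:
            if requests.get(port) == top:
                # Self-update: the winner becomes the most recently granted.
                self.order.remove(port)
                self.order.append(port)
                return port
        return None


if __name__ == "__main__":
    arb = LrgArbiter(num_inputs=4)
    print(arb.arbitrate({0: 0, 1: 0, 2: 0}))  # equal QoS: LRG order decides
    print(arb.arbitrate({0: 0, 1: 0, 2: 0}))  # previous winner moved to the back
    print(arb.arbitrate({0: 0, 2: 3}))        # higher QoS overrides LRG order
```

Because every grant moves the winner to the back of the order, an input that keeps requesting at the same QoS level is eventually served, which is the starvation-avoidance property that fixed priorities lack.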
doi_str_mv 10.1109/ISSCC.2012.6177098
format conference_proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6177098</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6177098</ieee_id><sourcerecordid>6177098</sourcerecordid><originalsourceid>FETCH-ieee_primary_61770983</originalsourceid><addsrcrecordid>eNp9j01OwzAYRM2fRAq9AGy-CzixY8dOliiigkXFIpVYVk7itEapU2wXlDNwAA7ExQhS2XY1M3rSkwahO0piSkmRPFdVWcYpoWksqJSkyM_QvJA55UIywqTk5yhKmRQ4F0RcoNk_EOwSRYQWDIuMkWs08_6NEJIVIo_Q1wPwOFvViQcW879MXkHwn2_BwX-a0GyhU7UzDUxjC173HT7sWxWM3UCvlQ_Y6Ubb0I9445QNuoW9M4MzYQRlW3g_qH7qeOiw1-7DNBqUq01wk2KwYCzwzO6gXL5Ut-iqU73X82PeoPvF46p8wkZrvZ6sO-XG9fE8O01_AeRhWHQ</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>A 4.5Tb/s 3.4Tb/s/W 64×64 switch fabric with self-updating least-recently-granted priority and quality-of-service arbitration in 45nm CMOS</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Satpathy, S. ; Sewell, K. ; Manville, T. ; Yen-Po Chen ; Dreslinski, R. ; Sylvester, D. ; Mudge, T. ; Blaauw, D.</creator><creatorcontrib>Satpathy, S. ; Sewell, K. ; Manville, T. ; Yen-Po Chen ; Dreslinski, R. ; Sylvester, D. ; Mudge, T. ; Blaauw, D.</creatorcontrib><description>High-speed and low-power routers form the basic building blocks of on-die interconnect fabrics that are critical to overall throughput and energy efficiency of high performance systems. Conventional routers use distinct logic blocks for routing data and handling arbitration. At higher radices, connections between these blocks become a bottleneck, limiting router scalability and degrading performance. Recently, two switch topologies merged the data routing fabric with arbitration control, avoiding this bottleneck. 
However, relies on centralized control for channel allocation, limiting performance, while restricted to a small set of fixed priorities, rendering input ports prone to starvation. In addition, ever larger CMPs will require continued increases in bandwidth over previous designs. To address these issues, we present a 64x64 single-stage swizzle-switch network (SSN) with 128b data buses (8192 total input/output wires). The SSN can connect any input to any output, including multicast. It has a peak measured throughput of 4.5Tb/s at 1.1V in 45nm SOI CMOS at 25°C. The SSN's key features are: 1) a single-cycle least-recently granted (LRG) priority arbitration technique that reuses the already present input and output data buses and their drivers and sense amps; 2) an additional 4-level message-based priority arbitration for quality of service (QoS) with 2% logic and 3% wiring overhead; 3) a bidirectional bitline repeater that allows the router to scale to &gt;;8000 wires. These features result in a compact fabric (4.06mm2) with throughput gain of 2.1 x over at 3.4Tb/s/W efficiency, which improves to 7.4Tb/s/W at 600mV.</description><identifier>ISSN: 0193-6530</identifier><identifier>ISBN: 1467303763</identifier><identifier>ISBN: 9781467303767</identifier><identifier>EISSN: 2376-8606</identifier><identifier>EISBN: 9781467303774</identifier><identifier>EISBN: 1467303771</identifier><identifier>EISBN: 9781467303743</identifier><identifier>EISBN: 1467303747</identifier><identifier>EISBN: 9781467303750</identifier><identifier>EISBN: 1467303755</identifier><identifier>DOI: 10.1109/ISSCC.2012.6177098</identifier><language>eng</language><publisher>IEEE</publisher><subject>Delay ; Fabrics ; Quality of service ; Repeaters ; Routing ; Switches ; Very large scale integration</subject><ispartof>2012 IEEE International Solid-State Circuits Conference, 2012, 
p.478-480</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6177098$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2052,27902,54530,54895,54907</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6177098$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Satpathy, S.</creatorcontrib><creatorcontrib>Sewell, K.</creatorcontrib><creatorcontrib>Manville, T.</creatorcontrib><creatorcontrib>Yen-Po Chen</creatorcontrib><creatorcontrib>Dreslinski, R.</creatorcontrib><creatorcontrib>Sylvester, D.</creatorcontrib><creatorcontrib>Mudge, T.</creatorcontrib><creatorcontrib>Blaauw, D.</creatorcontrib><title>A 4.5Tb/s 3.4Tb/s/W 64×64 switch fabric with self-updating least-recently-granted priority and quality-of-service arbitration in 45nm CMOS</title><title>2012 IEEE International Solid-State Circuits Conference</title><addtitle>ISSCC</addtitle><description>High-speed and low-power routers form the basic building blocks of on-die interconnect fabrics that are critical to overall throughput and energy efficiency of high performance systems. Conventional routers use distinct logic blocks for routing data and handling arbitration. At higher radices, connections between these blocks become a bottleneck, limiting router scalability and degrading performance. Recently, two switch topologies merged the data routing fabric with arbitration control, avoiding this bottleneck. However, relies on centralized control for channel allocation, limiting performance, while restricted to a small set of fixed priorities, rendering input ports prone to starvation. In addition, ever larger CMPs will require continued increases in bandwidth over previous designs. 
To address these issues, we present a 64x64 single-stage swizzle-switch network (SSN) with 128b data buses (8192 total input/output wires). The SSN can connect any input to any output, including multicast. It has a peak measured throughput of 4.5Tb/s at 1.1V in 45nm SOI CMOS at 25°C. The SSN's key features are: 1) a single-cycle least-recently granted (LRG) priority arbitration technique that reuses the already present input and output data buses and their drivers and sense amps; 2) an additional 4-level message-based priority arbitration for quality of service (QoS) with 2% logic and 3% wiring overhead; 3) a bidirectional bitline repeater that allows the router to scale to &gt;;8000 wires. These features result in a compact fabric (4.06mm2) with throughput gain of 2.1 x over at 3.4Tb/s/W efficiency, which improves to 7.4Tb/s/W at 600mV.</description><subject>Delay</subject><subject>Fabrics</subject><subject>Quality of service</subject><subject>Repeaters</subject><subject>Routing</subject><subject>Switches</subject><subject>Very large scale integration</subject><issn>0193-6530</issn><issn>2376-8606</issn><isbn>1467303763</isbn><isbn>9781467303767</isbn><isbn>9781467303774</isbn><isbn>1467303771</isbn><isbn>9781467303743</isbn><isbn>1467303747</isbn><isbn>9781467303750</isbn><isbn>1467303755</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2012</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNp9j01OwzAYRM2fRAq9AGy-CzixY8dOliiigkXFIpVYVk7itEapU2wXlDNwAA7ExQhS2XY1M3rSkwahO0piSkmRPFdVWcYpoWksqJSkyM_QvJA55UIywqTk5yhKmRQ4F0RcoNk_EOwSRYQWDIuMkWs08_6NEJIVIo_Q1wPwOFvViQcW879MXkHwn2_BwX-a0GyhU7UzDUxjC173HT7sWxWM3UCvlQ_Y6Ubb0I9445QNuoW9M4MzYQRlW3g_qH7qeOiw1-7DNBqUq01wk2KwYCzwzO6gXL5Ut-iqU73X82PeoPvF46p8wkZrvZ6sO-XG9fE8O01_AeRhWHQ</recordid><startdate>201202</startdate><enddate>201202</enddate><creator>Satpathy, S.</creator><creator>Sewell, K.</creator><creator>Manville, 
T.</creator><creator>Yen-Po Chen</creator><creator>Dreslinski, R.</creator><creator>Sylvester, D.</creator><creator>Mudge, T.</creator><creator>Blaauw, D.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>201202</creationdate><title>A 4.5Tb/s 3.4Tb/s/W 64×64 switch fabric with self-updating least-recently-granted priority and quality-of-service arbitration in 45nm CMOS</title><author>Satpathy, S. ; Sewell, K. ; Manville, T. ; Yen-Po Chen ; Dreslinski, R. ; Sylvester, D. ; Mudge, T. ; Blaauw, D.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-ieee_primary_61770983</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Delay</topic><topic>Fabrics</topic><topic>Quality of service</topic><topic>Repeaters</topic><topic>Routing</topic><topic>Switches</topic><topic>Very large scale integration</topic><toplevel>online_resources</toplevel><creatorcontrib>Satpathy, S.</creatorcontrib><creatorcontrib>Sewell, K.</creatorcontrib><creatorcontrib>Manville, T.</creatorcontrib><creatorcontrib>Yen-Po Chen</creatorcontrib><creatorcontrib>Dreslinski, R.</creatorcontrib><creatorcontrib>Sylvester, D.</creatorcontrib><creatorcontrib>Mudge, T.</creatorcontrib><creatorcontrib>Blaauw, D.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Satpathy, S.</au><au>Sewell, K.</au><au>Manville, T.</au><au>Yen-Po Chen</au><au>Dreslinski, 
R.</au><au>Sylvester, D.</au><au>Mudge, T.</au><au>Blaauw, D.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>A 4.5Tb/s 3.4Tb/s/W 64×64 switch fabric with self-updating least-recently-granted priority and quality-of-service arbitration in 45nm CMOS</atitle><btitle>2012 IEEE International Solid-State Circuits Conference</btitle><stitle>ISSCC</stitle><date>2012-02</date><risdate>2012</risdate><spage>478</spage><epage>480</epage><pages>478-480</pages><issn>0193-6530</issn><eissn>2376-8606</eissn><isbn>1467303763</isbn><isbn>9781467303767</isbn><eisbn>9781467303774</eisbn><eisbn>1467303771</eisbn><eisbn>9781467303743</eisbn><eisbn>1467303747</eisbn><eisbn>9781467303750</eisbn><eisbn>1467303755</eisbn><abstract>High-speed and low-power routers form the basic building blocks of on-die interconnect fabrics that are critical to overall throughput and energy efficiency of high performance systems. Conventional routers use distinct logic blocks for routing data and handling arbitration. At higher radices, connections between these blocks become a bottleneck, limiting router scalability and degrading performance. Recently, two switch topologies merged the data routing fabric with arbitration control, avoiding this bottleneck. However, relies on centralized control for channel allocation, limiting performance, while restricted to a small set of fixed priorities, rendering input ports prone to starvation. In addition, ever larger CMPs will require continued increases in bandwidth over previous designs. To address these issues, we present a 64x64 single-stage swizzle-switch network (SSN) with 128b data buses (8192 total input/output wires). The SSN can connect any input to any output, including multicast. It has a peak measured throughput of 4.5Tb/s at 1.1V in 45nm SOI CMOS at 25°C. 
The SSN's key features are: 1) a single-cycle least-recently granted (LRG) priority arbitration technique that reuses the already present input and output data buses and their drivers and sense amps; 2) an additional 4-level message-based priority arbitration for quality of service (QoS) with 2% logic and 3% wiring overhead; 3) a bidirectional bitline repeater that allows the router to scale to &gt;;8000 wires. These features result in a compact fabric (4.06mm2) with throughput gain of 2.1 x over at 3.4Tb/s/W efficiency, which improves to 7.4Tb/s/W at 600mV.</abstract><pub>IEEE</pub><doi>10.1109/ISSCC.2012.6177098</doi></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0193-6530
ispartof 2012 IEEE International Solid-State Circuits Conference, 2012, p.478-480
issn 0193-6530
2376-8606
language eng
recordid cdi_ieee_primary_6177098
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Delay
Fabrics
Quality of service
Repeaters
Routing
Switches
Very large scale integration
title A 4.5Tb/s 3.4Tb/s/W 64×64 switch fabric with self-updating least-recently-granted priority and quality-of-service arbitration in 45nm CMOS
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T01%3A12%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=A%204.5Tb/s%203.4Tb/s/W%2064%C3%9764%20switch%20fabric%20with%20self-updating%20least-recently-granted%20priority%20and%20quality-of-service%20arbitration%20in%2045nm%20CMOS&rft.btitle=2012%20IEEE%20International%20Solid-State%20Circuits%20Conference&rft.au=Satpathy,%20S.&rft.date=2012-02&rft.spage=478&rft.epage=480&rft.pages=478-480&rft.issn=0193-6530&rft.eissn=2376-8606&rft.isbn=1467303763&rft.isbn_list=9781467303767&rft_id=info:doi/10.1109/ISSCC.2012.6177098&rft.eisbn=9781467303774&rft.eisbn_list=1467303771&rft.eisbn_list=9781467303743&rft.eisbn_list=1467303747&rft.eisbn_list=9781467303750&rft.eisbn_list=1467303755&rft_dat=%3Cieee_6IE%3E6177098%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-ieee_primary_61770983%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6177098&rfr_iscdi=true