High Performance Communication on Reconfigurable Clusters
FPGA clusters with the FPGAs directly linked through their Multi-Gigabit Transceivers (MGT) have a proven advantage over other commodity architectures for communication-bound applications. To date, however, communication infrastructure for such clusters has generally taken one of two approaches: nea...
Main Authors: | Sheng, Jiayi; Yang, Chen; Herbordt, Martin C. |
---|---|
Format: | Conference Proceeding |
Language: | English |
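Subjects: | Field programmable gate arrays; FPGA Cluster Communication; High-Performance Computing; Microarchitecture; Payloads; Router Microarchitecture; Routing; Switches; System recovery |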
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 2194 |
container_issue | |
container_start_page | 219 |
container_title | |
container_volume | |
creator | Sheng, Jiayi ; Yang, Chen ; Herbordt, Martin C. |
description | FPGA clusters with the FPGAs directly linked through their Multi-Gigabit Transceivers (MGT) have a proven advantage over other commodity architectures for communication-bound applications. To date, however, communication infrastructure for such clusters has generally taken one of two approaches: nearest neighbor only, which is fast but has limited utility, and processor-based, which is general, but relatively slow. What is needed is for communication microarchitecture of these systems to be systematically explored, as has been done for HPC clusters and for Networks on Chip (NoC) on both FPGAs and ASICs. Our first contribution is finding that the properties of clusters of tightly coupled FPGAs substantially influence the router design space. We create a candidate router and generalize it so that it is parameterized by routing algorithm, arbitration policy, and virtual channels (VC). We have created a cycle-accurate simulator validated on a four-FPGA system. We evaluate the design space with respect to a number of standard communication patterns and packet sizes. These results enable selection of the appropriate router for any resource budget. We find that the optimality of the router design varies significantly with workloads. We present a framework that helps to determine appropriate parameters based on different applications and generate the HDL design. We observe that for a 512 FPGA cluster, compared with the router configuration with the best average performance, application-aware router selection can lead to substantial improvement in performance or reduction in area. |
doi_str_mv | 10.1109/FPL.2018.00044 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_8533497</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8533497</ieee_id><sourcerecordid>8533497</sourcerecordid><originalsourceid>FETCH-LOGICAL-i241t-61a4a769de13aeb56e8df12698d3276b24d038e47c072e51e569b5e826dc50e93</originalsourceid><addsrcrecordid>eNotjEtLw0AURkehYKnZunGTP5A4d553lhKsLQQsYtdlktzUkTxkkiz89wYUPjiLc_gYewCeA3D3tD-VueCAOedcqRuWOIugJRrUYNUt24JTJgOFeMeSafpaM66VRW22zB3C9TM9UWzH2PuhprQY-34ZQu3nMA7puneqx6EN1yX6qlt9t0wzxemebVrfTZT8c8fO-5eP4pCVb6_H4rnMglAwZwa88ta4hkB6qrQhbFoQxmEjhTWVUA2XSMrW3ArSQNq4ShMK09Sak5M79vj3G4jo8h1D7-PPBbWUyln5C45KRtg</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>High Performance Communication on Reconfigurable Clusters</title><source>IEEE Xplore All Conference Series</source><creator>Sheng, Jiayi ; Yang, Chen ; Herbordt, Martin C.</creator><creatorcontrib>Sheng, Jiayi ; Yang, Chen ; Herbordt, Martin C.</creatorcontrib><description>FPGA clusters with the FPGAs directly linked through their Multi-Gigabit Transceivers (MGT) have a proven advantage over other commodity architectures for communication-bound applications. To date, however, communication infrastructure for such clusters has generally taken one of two approaches: nearest neighbor only, which is fast but has limited utility, and processor-based, which is general, but relatively slow. What is needed is for communication microarchitecture of these systems to be systematically explored, as has been done for HPC clusters and for Networks on Chip (NoC) on both FPGAs and ASICs. Our first contribution is finding that the properties of clusters of tightly coupled FPGAs substantially influence the router design space. 
We create a candidate router and generalize it so that it is parameterized by routing algorithm, arbitration policy, and virtual channels (VC). We have created a cycle-accurate simulator validated on a four-FPGA system. We evaluate the design space with respect to a number of standard communication patterns and packet sizes. These results enable selection of the appropriate router for any resource budget. We find that the optimality of the router design varies significantly with workloads. We present a framework that helps to determine appropriate parameters based on different applications and generate the HDL design. We observe that for a 512 FPGA cluster, compared with the router configuration with the best average performance, application-aware router selection can lead to substantial improvement in performance or reduction in area.</description><identifier>EISSN: 1946-1488</identifier><identifier>EISBN: 9781538685174</identifier><identifier>EISBN: 1538685175</identifier><identifier>DOI: 10.1109/FPL.2018.00044</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Field programmable gate arrays ; FPGA Cluster Communication ; High-Performance Computing ; Microarchitecture ; Payloads ; Router Microarchitecture ; Routing ; Switches ; System recovery</subject><ispartof>2018 28th International Conference on Field Programmable Logic and Applications (FPL), 2018, 
p.219-2194</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8533497$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8533497$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Sheng, Jiayi</creatorcontrib><creatorcontrib>Yang, Chen</creatorcontrib><creatorcontrib>Herbordt, Martin C.</creatorcontrib><title>High Performance Communication on Reconfigurable Clusters</title><title>2018 28th International Conference on Field Programmable Logic and Applications (FPL)</title><addtitle>FPL</addtitle><description>FPGA clusters with the FPGAs directly linked through their Multi-Gigabit Transceivers (MGT) have a proven advantage over other commodity architectures for communication-bound applications. To date, however, communication infrastructure for such clusters has generally taken one of two approaches: nearest neighbor only, which is fast but has limited utility, and processor-based, which is general, but relatively slow. What is needed is for communication microarchitecture of these systems to be systematically explored, as has been done for HPC clusters and for Networks on Chip (NoC) on both FPGAs and ASICs. Our first contribution is finding that the properties of clusters of tightly coupled FPGAs substantially influence the router design space. We create a candidate router and generalize it so that it is parameterized by routing algorithm, arbitration policy, and virtual channels (VC). We have created a cycle-accurate simulator validated on a four-FPGA system. We evaluate the design space with respect to a number of standard communication patterns and packet sizes. 
These results enable selection of the appropriate router for any resource budget. We find that the optimality of the router design varies significantly with workloads. We present a framework that helps to determine appropriate parameters based on different applications and generate the HDL design. We observe that for a 512 FPGA cluster, compared with the router configuration with the best average performance, application-aware router selection can lead to substantial improvement in performance or reduction in area.</description><subject>Field programmable gate arrays</subject><subject>FPGA Cluster Communication</subject><subject>High-Performance Computing</subject><subject>Microarchitecture</subject><subject>Payloads</subject><subject>Router Microarchitecture</subject><subject>Routing</subject><subject>Switches</subject><subject>System recovery</subject><issn>1946-1488</issn><isbn>9781538685174</isbn><isbn>1538685175</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2018</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotjEtLw0AURkehYKnZunGTP5A4d553lhKsLQQsYtdlktzUkTxkkiz89wYUPjiLc_gYewCeA3D3tD-VueCAOedcqRuWOIugJRrUYNUt24JTJgOFeMeSafpaM66VRW22zB3C9TM9UWzH2PuhprQY-34ZQu3nMA7puneqx6EN1yX6qlt9t0wzxemebVrfTZT8c8fO-5eP4pCVb6_H4rnMglAwZwa88ta4hkB6qrQhbFoQxmEjhTWVUA2XSMrW3ArSQNq4ShMK09Sak5M79vj3G4jo8h1D7-PPBbWUyln5C45KRtg</recordid><startdate>201808</startdate><enddate>201808</enddate><creator>Sheng, Jiayi</creator><creator>Yang, Chen</creator><creator>Herbordt, Martin C.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201808</creationdate><title>High Performance Communication on Reconfigurable Clusters</title><author>Sheng, Jiayi ; Yang, Chen ; Herbordt, Martin 
C.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i241t-61a4a769de13aeb56e8df12698d3276b24d038e47c072e51e569b5e826dc50e93</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Field programmable gate arrays</topic><topic>FPGA Cluster Communication</topic><topic>High-Performance Computing</topic><topic>Microarchitecture</topic><topic>Payloads</topic><topic>Router Microarchitecture</topic><topic>Routing</topic><topic>Switches</topic><topic>System recovery</topic><toplevel>online_resources</toplevel><creatorcontrib>Sheng, Jiayi</creatorcontrib><creatorcontrib>Yang, Chen</creatorcontrib><creatorcontrib>Herbordt, Martin C.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE/IET Electronic Library</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Sheng, Jiayi</au><au>Yang, Chen</au><au>Herbordt, Martin C.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>High Performance Communication on Reconfigurable Clusters</atitle><btitle>2018 28th International Conference on Field Programmable Logic and Applications (FPL)</btitle><stitle>FPL</stitle><date>2018-08</date><risdate>2018</risdate><spage>219</spage><epage>2194</epage><pages>219-2194</pages><eissn>1946-1488</eissn><eisbn>9781538685174</eisbn><eisbn>1538685175</eisbn><coden>IEEPAD</coden><abstract>FPGA clusters with the FPGAs directly linked through their Multi-Gigabit Transceivers (MGT) have a proven advantage over other commodity architectures for communication-bound 
applications. To date, however, communication infrastructure for such clusters has generally taken one of two approaches: nearest neighbor only, which is fast but has limited utility, and processor-based, which is general, but relatively slow. What is needed is for communication microarchitecture of these systems to be systematically explored, as has been done for HPC clusters and for Networks on Chip (NoC) on both FPGAs and ASICs. Our first contribution is finding that the properties of clusters of tightly coupled FPGAs substantially influence the router design space. We create a candidate router and generalize it so that it is parameterized by routing algorithm, arbitration policy, and virtual channels (VC). We have created a cycle-accurate simulator validated on a four-FPGA system. We evaluate the design space with respect to a number of standard communication patterns and packet sizes. These results enable selection of the appropriate router for any resource budget. We find that the optimality of the router design varies significantly with workloads. We present a framework that helps to determine appropriate parameters based on different applications and generate the HDL design. We observe that for a 512 FPGA cluster, compared with the router configuration with the best average performance, application-aware router selection can lead to substantial improvement in performance or reduction in area.</abstract><pub>IEEE</pub><doi>10.1109/FPL.2018.00044</doi><tpages>1976</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | EISSN: 1946-1488 |
ispartof | 2018 28th International Conference on Field Programmable Logic and Applications (FPL), 2018, p.219-2194 |
issn | 1946-1488 |
language | eng |
recordid | cdi_ieee_primary_8533497 |
source | IEEE Xplore All Conference Series |
subjects | Field programmable gate arrays ; FPGA Cluster Communication ; High-Performance Computing ; Microarchitecture ; Payloads ; Router Microarchitecture ; Routing ; Switches ; System recovery |
title | High Performance Communication on Reconfigurable Clusters |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T10%3A08%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=High%20Performance%20Communication%20on%20Reconfigurable%20Clusters&rft.btitle=2018%2028th%20International%20Conference%20on%20Field%20Programmable%20Logic%20and%20Applications%20(FPL)&rft.au=Sheng,%20Jiayi&rft.date=2018-08&rft.spage=219&rft.epage=2194&rft.pages=219-2194&rft.eissn=1946-1488&rft.coden=IEEPAD&rft_id=info:doi/10.1109/FPL.2018.00044&rft.eisbn=9781538685174&rft.eisbn_list=1538685175&rft_dat=%3Cieee_CHZPO%3E8533497%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i241t-61a4a769de13aeb56e8df12698d3276b24d038e47c072e51e569b5e826dc50e93%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=8533497&rfr_iscdi=true |