On the High-Performance Computing of Layered Green's Function Based on the Graphics Processing Unit
The efficient evaluation of Sommerfeld integrals (SIs) in planar layered media has been a long-term bottleneck in the accurate electromagnetic analysis of modern radio frequency (RF) circuits, chips, and devices. This work investigates the high-performance computing of SIs using modern graphics proc...
Published in: | IEEE transactions on antennas and propagation 2024-06, Vol.72 (6), p.5159-5170 |
---|---|
Main Authors: | Wu, Bi-Yi ; Yan, Chao-Ze ; Yuan, Xin ; Zhang, Qianyun ; He, Wei-Jia ; Sheng, Xin-Qing |
Format: | Article |
Language: | English |
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c175t-8baabef70ca12b1ef55c1206b0f35637bf753dff1ee2e30d125ff1f64acf71c03 |
container_end_page | 5170 |
container_issue | 6 |
container_start_page | 5159 |
container_title | IEEE transactions on antennas and propagation |
container_volume | 72 |
creator | Wu, Bi-Yi ; Yan, Chao-Ze ; Yuan, Xin ; Zhang, Qianyun ; He, Wei-Jia ; Sheng, Xin-Qing |
description | The efficient evaluation of Sommerfeld integrals (SIs) in planar layered media has been a long-standing bottleneck in the accurate electromagnetic analysis of modern radio frequency (RF) circuits, chips, and devices. This work investigates the high-performance computing of SIs on modern graphics processing units (GPUs) to alleviate this difficulty. Based on a numerical integration procedure with controllable accuracy for SIs, GPU parallel schemes for SI heads and SI tails are first presented. By eliminating redundant calculations across SIs at multiple frequencies, a highly efficient parallel scheme enhanced by GPU tensor cores is developed that evaluates the SIs for multiple frequencies simultaneously. In addition, mixed-precision computing, which further accelerates the computation, is studied and tested. Extensive numerical experiments are carried out on two commercial gaming GPUs to verify the performance of the proposed parallel scheme. It achieves speedups of a dozen to hundreds of times over two high-end CPUs with full OpenMP parallelization. |
doi_str_mv | 10.1109/TAP.2024.3387835 |
format | article |
publisher | New York: IEEE |
coden | IETPAK |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
orcidid | 0000-0001-7790-3839 ; 0000-0002-4614-6545 ; 0009-0001-9559-4109 ; 0000-0002-8136-3705 ; 0000-0002-2147-4059 |
toplevel | peer_reviewed ; online_resources |
tpages | 12 |
linktohtml | https://ieeexplore.ieee.org/document/10504761 |
collection | IEEE All-Society Periodicals Package (ASPP) 2005–Present ; IEEE All-Society Periodicals Package (ASPP) 1998–Present ; IEEE Electronic Library Online ; CrossRef ; Electronics & Communications Abstracts ; Technology Research Database ; Advanced Technologies Database with Aerospace |
fulltext | fulltext |
identifier | ISSN: 0018-926X |
ispartof | IEEE transactions on antennas and propagation, 2024-06, Vol.72 (6), p.5159-5170 |
issn | 0018-926X ; 1558-2221 |
language | eng |
recordid | cdi_proquest_journals_3065466670 |
source | IEEE Xplore (Online service) |
subjects | Controllability ; Graphics processing unit (GPU) ; Graphics processing units ; Green's function methods ; Green's functions ; Green’s function ; Head ; High performance computing ; Instruction sets ; integral equation ; mixed-precision ; Numerical integration ; Parallel processing ; Silicon ; Sommerfeld integrals (SIs) ; Tail ; tensor cores ; Tensors |
title | On the High-Performance Computing of Layered Green's Function Based on the Graphics Processing Unit |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T17%3A07%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=On%20the%20High-Performance%20Computing%20of%20Layered%20Green's%20Function%20Based%20on%20the%20Graphics%20Processing%20Unit&rft.jtitle=IEEE%20transactions%20on%20antennas%20and%20propagation&rft.au=Wu,%20Bi-Yi&rft.date=2024-06-01&rft.volume=72&rft.issue=6&rft.spage=5159&rft.epage=5170&rft.pages=5159-5170&rft.issn=0018-926X&rft.eissn=1558-2221&rft.coden=IETPAK&rft_id=info:doi/10.1109/TAP.2024.3387835&rft_dat=%3Cproquest_ieee_%3E3065466670%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c175t-8baabef70ca12b1ef55c1206b0f35637bf753dff1ee2e30d125ff1f64acf71c03%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3065466670&rft_id=info:pmid/&rft_ieee_id=10504761&rfr_iscdi=true |