Concurrent algorithm for integrating three-dimensional B-spline functions into machines with shared memory such as GPU
The aim of this paper is to analyze the integration for 3D isogeometric finite element method solvers and its effective scheduling on hierarchical computer architecture. Data necessary for concurrency over elements is independent, so computation on this level is trivially concurrent. However, constr...
Published in: | Computer methods in applied mechanics and engineering 2022-08, Vol.398, p.115201, Article 115201 |
---|---|
Main Authors: | Anna Szyszka, Maciej Woźniak, Robert Schaefer |
Format: | Article |
Language: | English |
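Subjects: | Algorithms; B spline functions; Canonical forms; Computer architecture; Concurrency; Finite element method; GPGPU; Isogeometric finite element method; Mathematical analysis; Numerical integration; Quadratures; Scheduling; Trace theory |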
cited_by | cdi_FETCH-LOGICAL-c255t-777d2189fcb9bd94600abd8e30e508c74dffe91ffd8774ecb7ffe0ca56ad8f083 |
---|---|
cites | cdi_FETCH-LOGICAL-c255t-777d2189fcb9bd94600abd8e30e508c74dffe91ffd8774ecb7ffe0ca56ad8f083 |
container_end_page | |
container_issue | |
container_start_page | 115201 |
container_title | Computer methods in applied mechanics and engineering |
container_volume | 398 |
creator | Szyszka, Anna Woźniak, Maciej Schaefer, Robert |
description | This paper analyzes the integration step of 3D isogeometric finite element method solvers and its effective scheduling on hierarchical computer architectures. The data needed to integrate over different elements are independent, so computation at this level is trivially concurrent. However, constructing several layers of concurrency for the integration algorithm is challenging. In this work, we propose a multilevel concurrent integration algorithm, together with its scheduling, that brings one extra degree of possible speedup. To obtain this extra degree of speedup, we analyze concurrent integration inside elements. The scheduling algorithm is intended for the strongly related hierarchical architecture of a GPU. Using trace theory and the Foata Normal Form, we verify the integrity of the proposed solution. In summary, we propose a general method for analyzing the concurrency of the integration algorithm. We instantiate this method on a classical element-based integration algorithm; however, the methodology can also be applied to other integration algorithms, including sum factorization, fast numerical quadrature, and row-wise integration methods.
• Multilayer parallel integration of the coefficients of IsoGeometric Analysis (IGA) equations. • Trace theory model used for formal verification. • Diekert graph based sub-optimal scheduling on a cluster with GPU units. • Efficiency and scalability tests of the proposed parallel IGA for the L2 projection problem. • Application to sum factorization, fast numerical quadrature, and row-wise integration. |
doi_str_mv | 10.1016/j.cma.2022.115201 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2708390793</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0045782522003516</els_id><sourcerecordid>2708390793</sourcerecordid><originalsourceid>FETCH-LOGICAL-c255t-777d2189fcb9bd94600abd8e30e508c74dffe91ffd8774ecb7ffe0ca56ad8f083</originalsourceid><addsrcrecordid>eNp9UMFOAyEQJUYTa_UDvJF43gq0W9h40karSRM92DOhMHRpulCB1fTvpVnPzmUyb957mXkI3VIyoYTO73cT3akJI4xNKK0ZoWdoRAVvKkan4hyNCJnVFResvkRXKe1IKUHZCH0vgtd9jOAzVvttiC63HbYhYuczbKPKzm9xbiNAZVwHPrng1R4_Vemwdx6w7b3OBUsnQcCd0m2BE_4pRji1KoLBHXQhHnHqdYtVwsuP9TW6sGqf4Oavj9H65flz8Vqt3pdvi8dVpVld54pzbhgVjdWbZmOa2ZwQtTECpgRqIjSfGWuhodYawfkM9IaXmWhVz5URlojpGN0NvocYvnpIWe5CH8sDSTJe9g3hzbSw6MDSMaQUwcpDdJ2KR0mJPMUrd7LEK0_xyiHeonkYNFDO_3YQZdIOvAbjIugsTXD_qH8BNwSFNA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2708390793</pqid></control><display><type>article</type><title>Concurrent algorithm for integrating three-dimensional B-spline functions into machines with shared memory such as GPU</title><source>ScienceDirect Freedom Collection</source><creator>Szyszka, Anna ; Woźniak, Maciej ; Schaefer, Robert</creator><creatorcontrib>Szyszka, Anna ; Woźniak, Maciej ; Schaefer, Robert</creatorcontrib><description>The aim of this paper is to analyze the integration for 3D isogeometric finite element method solvers and its effective scheduling on hierarchical computer architecture. Data necessary for concurrency over elements is independent, so computation on this level is trivially concurrent. However, constructing several layers of concurrency for the integration algorithm is challenging. In this work, we propose a multilevel concurrent integration algorithm associated with scheduling that brings one extra degree of possible speedup. Because of one extra degree of possible speedup, we analyze the concurrent integration inside elements. 
The scheduling algorithm is intended for strongly related hierarchical architectures of a GPU. Using trace theory and Foata Normal Form, we verify integrity of the proposed solution. Summing up, we propose a general method for analyzing concurrency of the integration algorithm. We instantiate this method on a classical element-based integration algorithm, however, this methodology is possible to apply for other integration algorithms, including sum factorization, fast numerical quadrature, or row-wise integration methods.
•Multilayer parallel integrating coefficients of IsoGeometric Analysis IGA equations.•Trace theory model used for the formal verification.•Dikert graph based sub-optimal scheduling on cluster with GPU units.•Efficiency and scalability tests of proposed parallel IGA for L2 projection problem.•Application to sum factorization, fast numerical quadrature, row-wise integration.</description><identifier>ISSN: 0045-7825</identifier><identifier>EISSN: 1879-2138</identifier><identifier>DOI: 10.1016/j.cma.2022.115201</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><subject>Algorithms ; B spline functions ; Canonical forms ; Computer architecture ; Concurrency ; Finite element method ; GPGPU ; Isogeometric finite element method ; Mathematical analysis ; Numerical integration ; Quadratures ; Scheduling ; Trace theory</subject><ispartof>Computer methods in applied mechanics and engineering, 2022-08, Vol.398, p.115201, Article 115201</ispartof><rights>2022 Elsevier B.V.</rights><rights>Copyright Elsevier BV Aug 1, 2022</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c255t-777d2189fcb9bd94600abd8e30e508c74dffe91ffd8774ecb7ffe0ca56ad8f083</citedby><cites>FETCH-LOGICAL-c255t-777d2189fcb9bd94600abd8e30e508c74dffe91ffd8774ecb7ffe0ca56ad8f083</cites><orcidid>0000-0002-3179-7863 ; 0000-0002-5576-5671 ; 0000-0002-1669-2086</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Szyszka, Anna</creatorcontrib><creatorcontrib>Woźniak, Maciej</creatorcontrib><creatorcontrib>Schaefer, Robert</creatorcontrib><title>Concurrent algorithm for integrating three-dimensional B-spline functions into machines with shared memory such as GPU</title><title>Computer methods in applied mechanics and 
engineering</title><description>The aim of this paper is to analyze the integration for 3D isogeometric finite element method solvers and its effective scheduling on hierarchical computer architecture. Data necessary for concurrency over elements is independent, so computation on this level is trivially concurrent. However, constructing several layers of concurrency for the integration algorithm is challenging. In this work, we propose a multilevel concurrent integration algorithm associated with scheduling that brings one extra degree of possible speedup. Because of one extra degree of possible speedup, we analyze the concurrent integration inside elements. The scheduling algorithm is intended for strongly related hierarchical architectures of a GPU. Using trace theory and Foata Normal Form, we verify integrity of the proposed solution. Summing up, we propose a general method for analyzing concurrency of the integration algorithm. We instantiate this method on a classical element-based integration algorithm, however, this methodology is possible to apply for other integration algorithms, including sum factorization, fast numerical quadrature, or row-wise integration methods.
•Multilayer parallel integrating coefficients of IsoGeometric Analysis IGA equations.•Trace theory model used for the formal verification.•Dikert graph based sub-optimal scheduling on cluster with GPU units.•Efficiency and scalability tests of proposed parallel IGA for L2 projection problem.•Application to sum factorization, fast numerical quadrature, row-wise integration.</description><subject>Algorithms</subject><subject>B spline functions</subject><subject>Canonical forms</subject><subject>Computer architecture</subject><subject>Concurrency</subject><subject>Finite element method</subject><subject>GPGPU</subject><subject>Isogeometric finite element method</subject><subject>Mathematical analysis</subject><subject>Numerical integration</subject><subject>Quadratures</subject><subject>Scheduling</subject><subject>Trace theory</subject><issn>0045-7825</issn><issn>1879-2138</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp9UMFOAyEQJUYTa_UDvJF43gq0W9h40karSRM92DOhMHRpulCB1fTvpVnPzmUyb957mXkI3VIyoYTO73cT3akJI4xNKK0ZoWdoRAVvKkan4hyNCJnVFResvkRXKe1IKUHZCH0vgtd9jOAzVvttiC63HbYhYuczbKPKzm9xbiNAZVwHPrng1R4_Vemwdx6w7b3OBUsnQcCd0m2BE_4pRji1KoLBHXQhHnHqdYtVwsuP9TW6sGqf4Oavj9H65flz8Vqt3pdvi8dVpVld54pzbhgVjdWbZmOa2ZwQtTECpgRqIjSfGWuhodYawfkM9IaXmWhVz5URlojpGN0NvocYvnpIWe5CH8sDSTJe9g3hzbSw6MDSMaQUwcpDdJ2KR0mJPMUrd7LEK0_xyiHeonkYNFDO_3YQZdIOvAbjIugsTXD_qH8BNwSFNA</recordid><startdate>20220801</startdate><enddate>20220801</enddate><creator>Szyszka, Anna</creator><creator>Woźniak, Maciej</creator><creator>Schaefer, Robert</creator><general>Elsevier B.V</general><general>Elsevier 
BV</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7TB</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-3179-7863</orcidid><orcidid>https://orcid.org/0000-0002-5576-5671</orcidid><orcidid>https://orcid.org/0000-0002-1669-2086</orcidid></search><sort><creationdate>20220801</creationdate><title>Concurrent algorithm for integrating three-dimensional B-spline functions into machines with shared memory such as GPU</title><author>Szyszka, Anna ; Woźniak, Maciej ; Schaefer, Robert</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c255t-777d2189fcb9bd94600abd8e30e508c74dffe91ffd8774ecb7ffe0ca56ad8f083</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>B spline functions</topic><topic>Canonical forms</topic><topic>Computer architecture</topic><topic>Concurrency</topic><topic>Finite element method</topic><topic>GPGPU</topic><topic>Isogeometric finite element method</topic><topic>Mathematical analysis</topic><topic>Numerical integration</topic><topic>Quadratures</topic><topic>Scheduling</topic><topic>Trace theory</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Szyszka, Anna</creatorcontrib><creatorcontrib>Woźniak, Maciej</creatorcontrib><creatorcontrib>Schaefer, Robert</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with 
Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Computer methods in applied mechanics and engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Szyszka, Anna</au><au>Woźniak, Maciej</au><au>Schaefer, Robert</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Concurrent algorithm for integrating three-dimensional B-spline functions into machines with shared memory such as GPU</atitle><jtitle>Computer methods in applied mechanics and engineering</jtitle><date>2022-08-01</date><risdate>2022</risdate><volume>398</volume><spage>115201</spage><pages>115201-</pages><artnum>115201</artnum><issn>0045-7825</issn><eissn>1879-2138</eissn><abstract>The aim of this paper is to analyze the integration for 3D isogeometric finite element method solvers and its effective scheduling on hierarchical computer architecture. Data necessary for concurrency over elements is independent, so computation on this level is trivially concurrent. However, constructing several layers of concurrency for the integration algorithm is challenging. In this work, we propose a multilevel concurrent integration algorithm associated with scheduling that brings one extra degree of possible speedup. Because of one extra degree of possible speedup, we analyze the concurrent integration inside elements. The scheduling algorithm is intended for strongly related hierarchical architectures of a GPU. Using trace theory and Foata Normal Form, we verify integrity of the proposed solution. Summing up, we propose a general method for analyzing concurrency of the integration algorithm. 
We instantiate this method on a classical element-based integration algorithm, however, this methodology is possible to apply for other integration algorithms, including sum factorization, fast numerical quadrature, or row-wise integration methods.
•Multilayer parallel integrating coefficients of IsoGeometric Analysis IGA equations.•Trace theory model used for the formal verification.•Dikert graph based sub-optimal scheduling on cluster with GPU units.•Efficiency and scalability tests of proposed parallel IGA for L2 projection problem.•Application to sum factorization, fast numerical quadrature, row-wise integration.</abstract><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/j.cma.2022.115201</doi><orcidid>https://orcid.org/0000-0002-3179-7863</orcidid><orcidid>https://orcid.org/0000-0002-5576-5671</orcidid><orcidid>https://orcid.org/0000-0002-1669-2086</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0045-7825 |
ispartof | Computer methods in applied mechanics and engineering, 2022-08, Vol.398, p.115201, Article 115201 |
issn | 0045-7825 1879-2138 |
language | eng |
recordid | cdi_proquest_journals_2708390793 |
source | ScienceDirect Freedom Collection |
subjects | Algorithms B spline functions Canonical forms Computer architecture Concurrency Finite element method GPGPU Isogeometric finite element method Mathematical analysis Numerical integration Quadratures Scheduling Trace theory |
title | Concurrent algorithm for integrating three-dimensional B-spline functions into machines with shared memory such as GPU |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-24T01%3A57%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Concurrent%20algorithm%20for%20integrating%20three-dimensional%20B-spline%20functions%20into%20machines%20with%20shared%20memory%20such%20as%20GPU&rft.jtitle=Computer%20methods%20in%20applied%20mechanics%20and%20engineering&rft.au=Szyszka,%20Anna&rft.date=2022-08-01&rft.volume=398&rft.spage=115201&rft.pages=115201-&rft.artnum=115201&rft.issn=0045-7825&rft.eissn=1879-2138&rft_id=info:doi/10.1016/j.cma.2022.115201&rft_dat=%3Cproquest_cross%3E2708390793%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c255t-777d2189fcb9bd94600abd8e30e508c74dffe91ffd8774ecb7ffe0ca56ad8f083%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2708390793&rft_id=info:pmid/&rfr_iscdi=true |
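
The description in this record says the schedule is checked using trace theory and the Foata Normal Form. As a rough illustration of that idea only, the sketch below groups a sequence of tasks into Foata layers, where tasks in one layer are mutually independent and could run concurrently. The task names (`b1`, `p1`, `s`, ...), their read/write sets, and the helper names `dependent` and `foata_normal_form` are hypothetical and are not taken from the article; the article's actual task alphabet and dependency relation may differ.

```python
from collections import defaultdict

# Hypothetical tasks for one element: name -> (variables read, variables written).
TASKS = {
    "b1": (set(), {"B1"}),        # evaluate a basis function at quadrature point 1
    "b2": (set(), {"B2"}),        # evaluate a basis function at quadrature point 2
    "p1": ({"B1"}, {"P1"}),       # weighted product at point 1
    "p2": ({"B2"}, {"P2"}),       # weighted product at point 2
    "s": ({"P1", "P2"}, {"S"}),   # sum the products into one integral value
}


def dependent(a, b):
    """Two tasks are dependent if they are equal or one writes what the other uses."""
    reads_a, writes_a = TASKS[a]
    reads_b, writes_b = TASKS[b]
    return a == b or bool(writes_a & (reads_b | writes_b)) or bool(writes_b & reads_a)


def foata_normal_form(word, dependent):
    """Group the tasks of `word` (one valid sequential order) into Foata layers."""
    level = []  # level[i] is the layer index assigned to word[i]
    for i, task in enumerate(word):
        lvl = 0
        # A task is placed one layer after the latest earlier task it depends on.
        for j in range(i):
            if dependent(word[j], task):
                lvl = max(lvl, level[j] + 1)
        level.append(lvl)
    layers = defaultdict(list)
    for task, lvl in zip(word, level):
        layers[lvl].append(task)
    # Sort inside each layer so that equivalent orders give the same normal form.
    return [sorted(layers[k]) for k in sorted(layers)]


word = ["b1", "p1", "b2", "p2", "s"]
print(foata_normal_form(word, dependent))
# prints [['b1', 'b2'], ['p1', 'p2'], ['s']]
```

Any other valid sequential order of the same tasks, for example `["b2", "b1", "p2", "p1", "s"]`, yields the same layers, which is what makes the normal form usable as a canonical description of the available concurrency.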