
Concurrent algorithm for integrating three-dimensional B-spline functions into machines with shared memory such as GPU

The aim of this paper is to analyze integration in 3D isogeometric finite element method solvers and its effective scheduling on hierarchical computer architectures. The data needed for concurrency over elements are independent, so computation at this level is trivially concurrent. However, constr...
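
The full abstract in this record states that the authors use trace theory and the Foata Normal Form to verify their solution. As a rough illustration of that idea only (not the authors' algorithm), the Python sketch below groups a sequence of tasks into Foata classes: tasks within one class are pairwise independent and could run concurrently, while the classes themselves run one after another. The task names (`b0`, `m0`, `s`, and so on) and the dependency set `DEPS` are hypothetical, chosen to mimic evaluating basis functions at two quadrature points, forming two local contributions, and summing them.

```python
from collections import defaultdict

# Hypothetical dependencies between tasks of a single element:
# b0, b1 = evaluate basis functions at quadrature points 0 and 1
# m0, m1 = local contributions computed from b0 and b1
# s      = sum of the two contributions
DEPS = {("b0", "m0"), ("b1", "m1"), ("m0", "s"), ("m1", "s")}


def dependent(a, b):
    """Symmetric, reflexive dependency relation built from DEPS."""
    return a == b or (a, b) in DEPS or (b, a) in DEPS


def foata_normal_form(word, dependent):
    """Group the tasks of `word` into Foata classes.

    `word` is one valid sequential ordering of the tasks;
    `dependent(a, b)` is True when a and b must not be reordered.
    """
    levels = []  # levels[i] is the class index assigned to word[i]
    for i, task in enumerate(word):
        # A task is placed one class after the latest earlier task
        # it depends on, or in class 0 if there is none.
        level = 0
        for j in range(i):
            if dependent(word[j], task):
                level = max(level, levels[j] + 1)
        levels.append(level)

    classes = defaultdict(list)
    for task, level in zip(word, levels):
        classes[level].append(task)
    # Sort inside each class so equivalent words give the same result.
    return [sorted(classes[k]) for k in range(len(classes))]


if __name__ == "__main__":
    print(foata_normal_form(["b0", "m0", "b1", "m1", "s"], dependent))
    # prints [['b0', 'b1'], ['m0', 'm1'], ['s']]
```

Two sequential orderings that differ only by swapping independent tasks yield the same list of classes, which is what makes the normal form usable as a check that a concurrent schedule is equivalent to the sequential algorithm.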

Bibliographic Details
Published in:	Computer methods in applied mechanics and engineering 2022-08, Vol.398, p.115201, Article 115201
Main Authors:	Szyszka, Anna; Woźniak, Maciej; Schaefer, Robert
Format:	Article
Language:	English
Subjects:	Algorithms; B spline functions; Canonical forms; Computer architecture; Concurrency; Finite element method; GPGPU; Isogeometric finite element method; Mathematical analysis; Numerical integration; Quadratures; Scheduling; Trace theory

cited_by	cdi_FETCH-LOGICAL-c255t-777d2189fcb9bd94600abd8e30e508c74dffe91ffd8774ecb7ffe0ca56ad8f083
cites	cdi_FETCH-LOGICAL-c255t-777d2189fcb9bd94600abd8e30e508c74dffe91ffd8774ecb7ffe0ca56ad8f083
container_end_page
container_issue
container_start_page	115201
container_title	Computer methods in applied mechanics and engineering
container_volume	398
creator	Szyszka, Anna; Woźniak, Maciej; Schaefer, Robert
description	The aim of this paper is to analyze integration in 3D isogeometric finite element method solvers and its effective scheduling on hierarchical computer architectures. The data needed for concurrency over elements are independent, so computation at this level is trivially concurrent. However, constructing several layers of concurrency for the integration algorithm is challenging. In this work, we propose a multilevel concurrent integration algorithm, together with a scheduling scheme, that brings one extra degree of possible speedup: we also analyze concurrent integration inside elements. The scheduling algorithm is intended for the strongly related hierarchical architecture of a GPU. Using trace theory and the Foata Normal Form, we verify the integrity of the proposed solution. In summary, we propose a general method for analyzing the concurrency of an integration algorithm. We instantiate this method on a classical element-based integration algorithm; however, the methodology can also be applied to other integration algorithms, including sum factorization, fast numerical quadrature, and row-wise integration methods. • Multilayer parallel integration of the coefficients of IsoGeometric Analysis (IGA) equations. • Trace theory model used for formal verification. • Diekert graph based sub-optimal scheduling on a cluster with GPU units. • Efficiency and scalability tests of the proposed parallel IGA for an L2 projection problem. • Application to sum factorization, fast numerical quadrature, and row-wise integration.
doi_str_mv	10.1016/j.cma.2022.115201
format	article
publisher	Amsterdam: Elsevier B.V
rights	2022 Elsevier B.V.; Copyright Elsevier BV Aug 1, 2022
orcid	0000-0002-3179-7863; 0000-0002-5576-5671; 0000-0002-1669-2086
toplevel	peer_reviewed; online_resources
fulltext	fulltext
identifier	ISSN: 0045-7825
ispartof	Computer methods in applied mechanics and engineering, 2022-08, Vol.398, p.115201, Article 115201
issn	0045-7825 1879-2138
language	eng
recordid	cdi_proquest_journals_2708390793
source	ScienceDirect Freedom Collection
subjects	Algorithms; B spline functions; Canonical forms; Computer architecture; Concurrency; Finite element method; GPGPU; Isogeometric finite element method; Mathematical analysis; Numerical integration; Quadratures; Scheduling; Trace theory
title	Concurrent algorithm for integrating three-dimensional B-spline functions into machines with shared memory such as GPU
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-24T01%3A57%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Concurrent%20algorithm%20for%20integrating%20three-dimensional%20B-spline%20functions%20into%20machines%20with%20shared%20memory%20such%20as%20GPU&rft.jtitle=Computer%20methods%20in%20applied%20mechanics%20and%20engineering&rft.au=Szyszka,%20Anna&rft.date=2022-08-01&rft.volume=398&rft.spage=115201&rft.pages=115201-&rft.artnum=115201&rft.issn=0045-7825&rft.eissn=1879-2138&rft_id=info:doi/10.1016/j.cma.2022.115201&rft_dat=%3Cproquest_cross%3E2708390793%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c255t-777d2189fcb9bd94600abd8e30e508c74dffe91ffd8774ecb7ffe0ca56ad8f083%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2708390793&rft_id=info:pmid/&rfr_iscdi=true