Adaptively accelerating FWM2DA seismic modelling program on multi-core CPU and GPU architectures
This paper presents work done towards porting of FWM2DA, an open source program, on multi-core CPU and GPU architectures. FWM2DA is a Fortran90 sequential program which performs acoustic wave propagation of single source location for the 2D subsurface earth model using finite difference time domain...
Published in: | Computers & geosciences 2021-01, Vol.146, p.104637, Article 104637 |
---|---|
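Main Authors: | Londhe, Ashutosh; Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Kharche, Komal |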
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-a326t-25b8c118c4d1b5d1eacedf3a797a9b4ca9f2813b4b9a2a3689d049f3989f183 |
---|---|
cites | cdi_FETCH-LOGICAL-a326t-25b8c118c4d1b5d1eacedf3a797a9b4ca9f2813b4b9a2a3689d049f3989f183 |
container_end_page | |
container_issue | |
container_start_page | 104637 |
container_title | Computers & geosciences |
container_volume | 146 |
creator | Londhe, Ashutosh; Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Kharche, Komal |
description | This paper presents work on porting FWM2DA, an open source program, to multi-core CPU and GPU architectures. FWM2DA is a sequential Fortran90 program that performs acoustic wave propagation from a single source location through a 2D subsurface earth model using finite difference time domain modelling. We have reproduced this program in the C programming language and extended it to perform acoustic wave propagation for multiple source locations and to allow different grid spacing in the x and z directions of the subsurface earth model. Performance of the upgraded version is improved by implementing inter-node and intra-node parallelization. Inter-node parallelization is implemented using MPI to utilize distributed memory efficiently, while intra-node parallelization focuses on efficient utilization of the underlying architecture’s resources. The outputs of these programs are compared with that of the original FWM2DA program for the Sigsbee2a model and found to be similar, thus establishing the correctness of the implementation. The developed programs are tested using a two-layer subsurface earth model with a 5000 × 5000 grid on the PARAM Shreshta system with Intel Xeon Gold 6148F Skylake CPU and NVIDIA Tesla V100 GPU hardware. The ported programming models are evaluated on execution time for wavefield propagation from a single source location using the two-layer subsurface earth model. Performance gains of 8.6X for the OpenMP, 83.22X for the OpenACC and 107.77X for the CUDA C implementations are recorded over the sequential C program. The multi-core CPU program is further optimized, and an overall 29.37X performance gain is achieved with respect to the sequential C program. Performance of the CUDA C program is also improved by making use of GPU shared memory, and a 1.18X speedup is recorded with respect to the baseline CUDA C program.
The numerical experiments demonstrate the effectiveness and robustness of the developed programs, with high scalability and efficiency on a multi-core CPU and GPU based HPC system.
• Adaptation and enhancement of the FWM2DA open source program on CPU and GPU architectures. • Efficient implementation of FWM2DA using the OpenMP, OpenACC and CUDA C programming models. • Performance gains of 8.6X for OpenMP, 83.22X for OpenACC and 107.77X for CUDA C are achieved. • Various optimization techniques have been detailed for CPU and GPU architectures. • The implemented parallelization strategies exhibit excellent MPI node scalability. |
doi_str_mv | 10.1016/j.cageo.2020.104637 |
format | article |
publisher | Elsevier Ltd |
rights | 2020 Elsevier Ltd |
fulltext | fulltext |
identifier | ISSN: 0098-3004 |
ispartof | Computers & geosciences, 2021-01, Vol.146, p.104637, Article 104637 |
issn | 0098-3004 1873-7803 |
language | eng |
recordid | cdi_crossref_primary_10_1016_j_cageo_2020_104637 |
source | ScienceDirect Freedom Collection |
title | Adaptively accelerating FWM2DA seismic modelling program on multi-core CPU and GPU architectures |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T15%3A03%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Adaptively%20accelerating%20FWM2DA%20seismic%20modelling%20program%20on%20multi-core%20CPU%20and%20GPU%20architectures&rft.jtitle=Computers%20&%20geosciences&rft.au=Londhe,%20Ashutosh&rft.date=2021-01&rft.volume=146&rft.spage=104637&rft.pages=104637-&rft.artnum=104637&rft.issn=0098-3004&rft.eissn=1873-7803&rft_id=info:doi/10.1016/j.cageo.2020.104637&rft_dat=%3Celsevier_cross%3ES0098300420306142%3C/elsevier_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a326t-25b8c118c4d1b5d1eacedf3a797a9b4ca9f2813b4b9a2a3689d049f3989f183%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
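
The description above mentions finite difference time domain modelling of acoustic waves with different grid spacing in x and z. As a rough illustration only, the sketch below shows one common way such a scheme can be written: a second-order stencil in time and space on a small two-layer model. It is not the authors' code (which the record says is written in C, OpenMP, OpenACC and CUDA C); Python is used here purely for brevity. The function names `ricker`, `step` and `simulate`, the Ricker source wavelet, the fixed zero-pressure boundaries, and all grid sizes and velocities are assumptions chosen for this example.

```python
import math


def ricker(t, f0=25.0, t0=0.04):
    """Ricker wavelet; an assumed, commonly used source time function."""
    a = (math.pi * f0 * (t - t0)) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)


def step(p_prev, p_cur, vel, dt, dx, dz):
    """Advance the pressure field by one time step (2nd order in time and space).

    Each interior point depends only on values from earlier time levels, so the
    two loops are independent per grid point. That independence is what makes
    this kind of update a natural target for thread-level parallelism.
    """
    nz, nx = len(p_cur), len(p_cur[0])
    p_next = [[0.0] * nx for _ in range(nz)]  # boundary rows/columns stay zero
    for iz in range(1, nz - 1):
        for ix in range(1, nx - 1):
            c = p_cur[iz][ix]
            d2x = (p_cur[iz][ix + 1] - 2.0 * c + p_cur[iz][ix - 1]) / dx ** 2
            d2z = (p_cur[iz + 1][ix] - 2.0 * c + p_cur[iz - 1][ix]) / dz ** 2
            c2dt2 = (vel[iz][ix] * dt) ** 2
            p_next[iz][ix] = 2.0 * c - p_prev[iz][ix] + c2dt2 * (d2x + d2z)
    return p_next


def simulate(nz=41, nx=41, dx=10.0, dz=5.0, nt=60, src=(20, 20)):
    """Propagate a single source through an assumed two-layer velocity model."""
    v_top, v_bottom = 1500.0, 2500.0  # illustrative velocities in m/s
    vel = [[v_top if iz < nz // 2 else v_bottom for _ in range(nx)]
           for iz in range(nz)]
    # Time step set to half of the stability (CFL) limit for this stencil.
    dt = 0.5 / (v_bottom * math.sqrt(1.0 / dx ** 2 + 1.0 / dz ** 2))
    p_prev = [[0.0] * nx for _ in range(nz)]
    p_cur = [[0.0] * nx for _ in range(nz)]
    sz, sx = src
    for it in range(nt):
        p_next = step(p_prev, p_cur, vel, dt, dx, dz)
        p_next[sz][sx] += dt ** 2 * ricker(it * dt)  # inject the source term
        p_prev, p_cur = p_cur, p_next
    return p_cur


if __name__ == "__main__":
    field = simulate()
    peak = max(abs(v) for row in field for v in row)
    print("peak absolute pressure after propagation:", peak)
```

In a multi-source setting, each call to `simulate` with a different `src` is independent of the others, which is consistent with the record's statement that sources are distributed across nodes with MPI; how the authors actually partition the work is not stated in this record.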