Adaptively accelerating FWM2DA seismic modelling program on multi-core CPU and GPU architectures

This paper presents work on porting FWM2DA, an open-source program, to multi-core CPU and GPU architectures. FWM2DA is a sequential Fortran 90 program that performs acoustic wave propagation from a single source location for a 2D subsurface earth model using finite difference time domain...

Bibliographic Details
Published in: Computers & geosciences 2021-01, Vol.146, p.104637, Article 104637
Main Authors: Londhe, Ashutosh; Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Kharche, Komal
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-a326t-25b8c118c4d1b5d1eacedf3a797a9b4ca9f2813b4b9a2a3689d049f3989f183
cites cdi_FETCH-LOGICAL-a326t-25b8c118c4d1b5d1eacedf3a797a9b4ca9f2813b4b9a2a3689d049f3989f183
container_end_page
container_issue
container_start_page 104637
container_title Computers & geosciences
container_volume 146
creator Londhe, Ashutosh
Rastogi, Richa
Srivastava, Abhishek
Khonde, Kiran
Sirasala, Kirannmayi M.
Kharche, Komal
description This paper presents work on porting FWM2DA, an open-source program, to multi-core CPU and GPU architectures. FWM2DA is a sequential Fortran 90 program that performs acoustic wave propagation from a single source location for a 2D subsurface earth model using finite-difference time-domain modelling. We have reimplemented this program in C and extended it to propagate acoustic waves from multiple source locations and to allow different grid spacing in the x and z directions of the subsurface earth model. The performance of the extended version is improved through inter-node and intra-node parallelization. Inter-node parallelization uses MPI to exploit distributed memory efficiently, while intra-node parallelization focuses on efficient use of the underlying architecture’s resources. The outputs of these programs are compared with those of the original FWM2DA program for the Sigsbee2a model and found to be similar, which establishes the correctness of the implementation. The developed programs are tested using a two-layer subsurface earth model with a 5000 × 5000 grid on the PARAM Shreshta system, with Intel Xeon Gold 6148F Skylake CPU and NVIDIA Tesla V100 GPU hardware. The ported programming models are evaluated on execution time for wavefield propagation from a single source location using the two-layer model. Performance gains of 8.6X for OpenMP, 83.22X for OpenACC and 107.77X for CUDA C are recorded over the sequential C program. The multi-core CPU program is further optimized, reaching an overall 29.37X speedup with respect to the sequential C program. The CUDA C program is also improved by using GPU shared memory, giving a 1.18X speedup over the baseline CUDA C program. The numerical experiments demonstrate the effectiveness and robustness of the developed programs, with high scalability and efficiency on a multi-core CPU and GPU based HPC system.
• Adaptation and enhancement of the FWM2DA open-source program on CPU and GPU architectures.
• Efficient implementation of FWM2DA using the OpenMP, OpenACC and CUDA C programming models.
• Performance gains of 8.6X for OpenMP, 83.22X for OpenACC and 107.77X for CUDA C are achieved.
• Various optimization techniques are detailed for CPU and GPU architectures.
• The implemented parallelization strategies exhibit excellent MPI node scalability.
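The kind of kernel the abstract describes (2D acoustic finite-difference time-domain propagation with independent grid spacing in x and z, parallelized within a node) can be illustrated with a minimal sketch. It is not taken from FWM2DA or from the paper's code: the function name `fd2d_step`, the row-major array layout, the second-order stencil, the constant-density assumption and the absence of absorbing boundaries are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical sketch, not the FWM2DA implementation.
 * One explicit time step of the 2D constant-density acoustic wave equation,
 * second order in time and space, with independent spacing dx and dz.
 * Arrays are row-major with nx samples per row: index = iz * nx + ix.
 * Edge samples of `next` are left untouched (no absorbing boundary). */
void fd2d_step(const float *prev, const float *curr, float *next,
               const float *vel, int nx, int nz,
               float dx, float dz, float dt)
{
    assert(nx >= 3 && nz >= 3);           /* the stencil needs interior points */

    const float rdx2 = 1.0f / (dx * dx);  /* 1 / dx^2 */
    const float rdz2 = 1.0f / (dz * dz);  /* 1 / dz^2 */

    /* Each row writes a disjoint part of `next`, so rows can be shared
     * among threads. The pragma is ignored if OpenMP is not enabled. */
    #pragma omp parallel for
    for (int iz = 1; iz < nz - 1; ++iz) {
        for (int ix = 1; ix < nx - 1; ++ix) {
            const int i = iz * nx + ix;
            /* Second derivatives by central differences in x and z. */
            const float d2x = (curr[i - 1] - 2.0f * curr[i] + curr[i + 1]) * rdx2;
            const float d2z = (curr[i - nx] - 2.0f * curr[i] + curr[i + nx]) * rdz2;
            const float c = vel[i] * dt;
            /* Leapfrog update: p(t+dt) = 2 p(t) - p(t-dt) + (v dt)^2 * Laplacian. */
            next[i] = 2.0f * curr[i] - prev[i] + c * c * (d2x + d2z);
        }
    }
}
```

A caller would rotate the three buffers after each step and add the source term at the source location. The time step must also respect the stability limit of the scheme, which this sketch does not check.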
doi_str_mv 10.1016/j.cageo.2020.104637
format article
fullrecord <record><control><sourceid>elsevier_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1016_j_cageo_2020_104637</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0098300420306142</els_id><sourcerecordid>S0098300420306142</sourcerecordid><originalsourceid>FETCH-LOGICAL-a326t-25b8c118c4d1b5d1eacedf3a797a9b4ca9f2813b4b9a2a3689d049f3989f183</originalsourceid><addsrcrecordid>eNp9kMtOwzAQRS0EEqXwBWz8Ayl-pIm9YFEVWpCKQALE0kzsSXGVR2Wnlfr3TShrVlea0Zm5OoTccjbhjGd3m4mFNbYTwcQwSTOZn5ERV7lMcsXkORkxplUiGUsvyVWMG8aYEGo6It8zB9vO77E6ULAWKwzQ-WZNF18v4mFGI_pYe0vr1mFVDYttaNcBato2tN5VnU9sG5DO3z4pNI4uhwz2x3dou13AeE0uSqgi3vzlmLwvHj_mT8nqdfk8n60SkCLrEjEtlOVc2dTxYuo4gkVXSsh1DrpILehSKC6LtNAgQGZKO5bqUmqlS67kmMjTVRvaGAOWZht8DeFgODODIrMxv4rMoMicFPXU_YnCvtjeYzDRemz6zz709Y1r_b_8EfVFcPg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Adaptively accelerating FWM2DA seismic modelling program on multi-core CPU and GPU architectures</title><source>ScienceDirect Freedom Collection</source><creator>Londhe, Ashutosh ; Rastogi, Richa ; Srivastava, Abhishek ; Khonde, Kiran ; Sirasala, Kirannmayi M. ; Kharche, Komal</creator><creatorcontrib>Londhe, Ashutosh ; Rastogi, Richa ; Srivastava, Abhishek ; Khonde, Kiran ; Sirasala, Kirannmayi M. ; Kharche, Komal</creatorcontrib><description>This paper presents work done towards porting of FWM2DA, an open source program, on multi-core CPU and GPU architectures. FWM2DA is a Fortran90 sequential program which performs acoustic wave propagation of single source location for the 2D subsurface earth model using finite difference time domain modelling. We have reproduced this program using C programming language and upgraded its functionality for performing acoustic wave propagation of multiple source locations and allowing different grid spacing in x and z direction for the subsurface earth model. 
Performance of the upgraded version is improved by implementing inter-node and intra-node parallelization. Inter-node parallelization is implemented using MPI for efficiently utilizing the distributed memory while intra-node parallelization focuses on efficient utilization of underlying architecture’s resources. Outcome of these programs are compared with that of the original FWM2DA program for Sigsbee2a model and found similar thus establishing the correctness of implementation done. The developed programs are tested using two layer subsurface earth model of 5000 X 5000 grid dimension on PARAM Shreshta system having Intel Xeon Gold 6148F Skylake CPU and NVIDIA Tesla V100 GPU. Ported programming models are evaluated for execution time for wavefield propagation of single source location using two layer subsurface earth model. Performance gain of 8.6X for OpenMP, 83.22X for OpenACC and 107.77X for CUDA C implementation are recorded over the sequential C program. The multi-core CPU program is further optimized and overall 29.37X performance is achieved with respect to the sequential C program. Performance of CUDA C program is also improved by making use of shared memory in GPU and 1.18X speedup is recorded with respect to the baseline CUDA C program. The numerical experiments demonstrate the effectiveness and robustness of the developed programs with high scalability and efficiency on multi-core CPU and GPU based HPC system. 
•Adaptation and enhancement of FWM2DA open source program on CPU and GPU architecture.•Efficient implementation of FWM2DA using OpenMP, OpenACC and CUDA C programming models.•Performance gain of 8.6X for OpenMP, 83.22X for OpenACC and 107.77X for CUDA C is achieved.•Various optimization techniques has been detailed for CPU and GPU architecture.•Implemented parallelization strategies exhibits excellent MPI node scalability.</description><identifier>ISSN: 0098-3004</identifier><identifier>EISSN: 1873-7803</identifier><identifier>DOI: 10.1016/j.cageo.2020.104637</identifier><language>eng</language><publisher>Elsevier Ltd</publisher><ispartof>Computers &amp; geosciences, 2021-01, Vol.146, p.104637, Article 104637</ispartof><rights>2020 Elsevier Ltd</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a326t-25b8c118c4d1b5d1eacedf3a797a9b4ca9f2813b4b9a2a3689d049f3989f183</citedby><cites>FETCH-LOGICAL-a326t-25b8c118c4d1b5d1eacedf3a797a9b4ca9f2813b4b9a2a3689d049f3989f183</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Londhe, Ashutosh</creatorcontrib><creatorcontrib>Rastogi, Richa</creatorcontrib><creatorcontrib>Srivastava, Abhishek</creatorcontrib><creatorcontrib>Khonde, Kiran</creatorcontrib><creatorcontrib>Sirasala, Kirannmayi M.</creatorcontrib><creatorcontrib>Kharche, Komal</creatorcontrib><title>Adaptively accelerating FWM2DA seismic modelling program on multi-core CPU and GPU architectures</title><title>Computers &amp; geosciences</title><description>This paper presents work done towards porting of FWM2DA, an open source program, on multi-core CPU and GPU architectures. 
FWM2DA is a Fortran90 sequential program which performs acoustic wave propagation of single source location for the 2D subsurface earth model using finite difference time domain modelling. We have reproduced this program using C programming language and upgraded its functionality for performing acoustic wave propagation of multiple source locations and allowing different grid spacing in x and z direction for the subsurface earth model. Performance of the upgraded version is improved by implementing inter-node and intra-node parallelization. Inter-node parallelization is implemented using MPI for efficiently utilizing the distributed memory while intra-node parallelization focuses on efficient utilization of underlying architecture’s resources. Outcome of these programs are compared with that of the original FWM2DA program for Sigsbee2a model and found similar thus establishing the correctness of implementation done. The developed programs are tested using two layer subsurface earth model of 5000 X 5000 grid dimension on PARAM Shreshta system having Intel Xeon Gold 6148F Skylake CPU and NVIDIA Tesla V100 GPU. Ported programming models are evaluated for execution time for wavefield propagation of single source location using two layer subsurface earth model. Performance gain of 8.6X for OpenMP, 83.22X for OpenACC and 107.77X for CUDA C implementation are recorded over the sequential C program. The multi-core CPU program is further optimized and overall 29.37X performance is achieved with respect to the sequential C program. Performance of CUDA C program is also improved by making use of shared memory in GPU and 1.18X speedup is recorded with respect to the baseline CUDA C program. The numerical experiments demonstrate the effectiveness and robustness of the developed programs with high scalability and efficiency on multi-core CPU and GPU based HPC system. 
•Adaptation and enhancement of FWM2DA open source program on CPU and GPU architecture.•Efficient implementation of FWM2DA using OpenMP, OpenACC and CUDA C programming models.•Performance gain of 8.6X for OpenMP, 83.22X for OpenACC and 107.77X for CUDA C is achieved.•Various optimization techniques has been detailed for CPU and GPU architecture.•Implemented parallelization strategies exhibits excellent MPI node scalability.</description><issn>0098-3004</issn><issn>1873-7803</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kMtOwzAQRS0EEqXwBWz8Ayl-pIm9YFEVWpCKQALE0kzsSXGVR2Wnlfr3TShrVlea0Zm5OoTccjbhjGd3m4mFNbYTwcQwSTOZn5ERV7lMcsXkORkxplUiGUsvyVWMG8aYEGo6It8zB9vO77E6ULAWKwzQ-WZNF18v4mFGI_pYe0vr1mFVDYttaNcBato2tN5VnU9sG5DO3z4pNI4uhwz2x3dou13AeE0uSqgi3vzlmLwvHj_mT8nqdfk8n60SkCLrEjEtlOVc2dTxYuo4gkVXSsh1DrpILehSKC6LtNAgQGZKO5bqUmqlS67kmMjTVRvaGAOWZht8DeFgODODIrMxv4rMoMicFPXU_YnCvtjeYzDRemz6zz709Y1r_b_8EfVFcPg</recordid><startdate>202101</startdate><enddate>202101</enddate><creator>Londhe, Ashutosh</creator><creator>Rastogi, Richa</creator><creator>Srivastava, Abhishek</creator><creator>Khonde, Kiran</creator><creator>Sirasala, Kirannmayi M.</creator><creator>Kharche, Komal</creator><general>Elsevier Ltd</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>202101</creationdate><title>Adaptively accelerating FWM2DA seismic modelling program on multi-core CPU and GPU architectures</title><author>Londhe, Ashutosh ; Rastogi, Richa ; Srivastava, Abhishek ; Khonde, Kiran ; Sirasala, Kirannmayi M. 
; Kharche, Komal</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a326t-25b8c118c4d1b5d1eacedf3a797a9b4ca9f2813b4b9a2a3689d049f3989f183</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Londhe, Ashutosh</creatorcontrib><creatorcontrib>Rastogi, Richa</creatorcontrib><creatorcontrib>Srivastava, Abhishek</creatorcontrib><creatorcontrib>Khonde, Kiran</creatorcontrib><creatorcontrib>Sirasala, Kirannmayi M.</creatorcontrib><creatorcontrib>Kharche, Komal</creatorcontrib><collection>CrossRef</collection><jtitle>Computers &amp; geosciences</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Londhe, Ashutosh</au><au>Rastogi, Richa</au><au>Srivastava, Abhishek</au><au>Khonde, Kiran</au><au>Sirasala, Kirannmayi M.</au><au>Kharche, Komal</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Adaptively accelerating FWM2DA seismic modelling program on multi-core CPU and GPU architectures</atitle><jtitle>Computers &amp; geosciences</jtitle><date>2021-01</date><risdate>2021</risdate><volume>146</volume><spage>104637</spage><pages>104637-</pages><artnum>104637</artnum><issn>0098-3004</issn><eissn>1873-7803</eissn><abstract>This paper presents work done towards porting of FWM2DA, an open source program, on multi-core CPU and GPU architectures. FWM2DA is a Fortran90 sequential program which performs acoustic wave propagation of single source location for the 2D subsurface earth model using finite difference time domain modelling. We have reproduced this program using C programming language and upgraded its functionality for performing acoustic wave propagation of multiple source locations and allowing different grid spacing in x and z direction for the subsurface earth model. 
Performance of the upgraded version is improved by implementing inter-node and intra-node parallelization. Inter-node parallelization is implemented using MPI for efficiently utilizing the distributed memory while intra-node parallelization focuses on efficient utilization of underlying architecture’s resources. Outcome of these programs are compared with that of the original FWM2DA program for Sigsbee2a model and found similar thus establishing the correctness of implementation done. The developed programs are tested using two layer subsurface earth model of 5000 X 5000 grid dimension on PARAM Shreshta system having Intel Xeon Gold 6148F Skylake CPU and NVIDIA Tesla V100 GPU. Ported programming models are evaluated for execution time for wavefield propagation of single source location using two layer subsurface earth model. Performance gain of 8.6X for OpenMP, 83.22X for OpenACC and 107.77X for CUDA C implementation are recorded over the sequential C program. The multi-core CPU program is further optimized and overall 29.37X performance is achieved with respect to the sequential C program. Performance of CUDA C program is also improved by making use of shared memory in GPU and 1.18X speedup is recorded with respect to the baseline CUDA C program. The numerical experiments demonstrate the effectiveness and robustness of the developed programs with high scalability and efficiency on multi-core CPU and GPU based HPC system. •Adaptation and enhancement of FWM2DA open source program on CPU and GPU architecture.•Efficient implementation of FWM2DA using OpenMP, OpenACC and CUDA C programming models.•Performance gain of 8.6X for OpenMP, 83.22X for OpenACC and 107.77X for CUDA C is achieved.•Various optimization techniques has been detailed for CPU and GPU architecture.•Implemented parallelization strategies exhibits excellent MPI node scalability.</abstract><pub>Elsevier Ltd</pub><doi>10.1016/j.cageo.2020.104637</doi></addata></record>
fulltext fulltext
identifier ISSN: 0098-3004
ispartof Computers & geosciences, 2021-01, Vol.146, p.104637, Article 104637
issn 0098-3004
1873-7803
language eng
recordid cdi_crossref_primary_10_1016_j_cageo_2020_104637
source ScienceDirect Freedom Collection
title Adaptively accelerating FWM2DA seismic modelling program on multi-core CPU and GPU architectures
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T15%3A03%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Adaptively%20accelerating%20FWM2DA%20seismic%20modelling%20program%20on%20multi-core%20CPU%20and%20GPU%20architectures&rft.jtitle=Computers%20&%20geosciences&rft.au=Londhe,%20Ashutosh&rft.date=2021-01&rft.volume=146&rft.spage=104637&rft.pages=104637-&rft.artnum=104637&rft.issn=0098-3004&rft.eissn=1873-7803&rft_id=info:doi/10.1016/j.cageo.2020.104637&rft_dat=%3Celsevier_cross%3ES0098300420306142%3C/elsevier_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a326t-25b8c118c4d1b5d1eacedf3a797a9b4ca9f2813b4b9a2a3689d049f3989f183%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true