Efficient simulation execution of cellular automata on GPU
Graphics Processing Units (GPUs) can be used as convenient hardware accelerators to speed up Cellular Automata (CA) simulations, which are employed in many scientific areas. However, an important set of CA have performance constraints due to GPU memory bandwidth. Few studies have fully explored how...
Published in: | Simulation modelling practice and theory 2022-07, Vol.118, p.102519, Article 102519 |
---|---|
Main Authors: | Cagigas-Muñiz, Daniel; Diaz-del-Rio, Fernando; Sevillano-Ramos, Jose Luis; Guisado-Lizar, Jose-Luis |
Format: | Article |
Language: | English |
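Subjects: | Cellular automata; Graphics Processing Units; Parallel computing; Performance optimization; Stencil computation |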
cited_by | cdi_FETCH-LOGICAL-c352t-fe4a0a85a015ed5c303e5145b20df8332ae0dfc933ce0a94a788885a87dc5eef3 |
---|---|
cites | cdi_FETCH-LOGICAL-c352t-fe4a0a85a015ed5c303e5145b20df8332ae0dfc933ce0a94a788885a87dc5eef3 |
container_end_page | |
container_issue | |
container_start_page | 102519 |
container_title | Simulation modelling practice and theory |
container_volume | 118 |
creator | Cagigas-Muñiz, Daniel; Diaz-del-Rio, Fernando; Sevillano-Ramos, Jose Luis; Guisado-Lizar, Jose-Luis |
description | Graphics Processing Units (GPUs) can be used as convenient hardware accelerators to speed up Cellular Automata (CA) simulations, which are employed in many scientific areas. However, an important set of CA have performance constraints due to GPU memory bandwidth. Few studies have fully explored how CA implementations can take advantage of modern GPU architectures, mainly in the case of intensive memory usage. In this paper, we make a thorough study of techniques (stencil computing framework, look-up tables, and packet coding) to efficiently implement CA on GPU, taking into account its detailed architecture. Exhaustive experiments to validate these implementation techniques for a number of significant memory-bounded CA are performed. The CA analysed include the classical Game of Life, a Forest Fire model, a Cyclic cellular automaton, and the WireWorld CA. The experimental results show that implementations using the presented techniques can significantly outperform a baseline standard GPU implementation. The best performance results of all known implementations of memory-bounded CA were obtained. Moreover, some of the techniques, like look-up tables or temporal blocking, are relatively easy to implement or to apply when the transition rules are simple. Finally, detailed descriptions and discussions of the indicated techniques are included, which may be useful to practitioners interested in developing high-performance simulations in efficient languages based on CA on GPU. Highlights: • A thorough review of Cellular Automata implementations in modern GPUs. • Study of which general aspects of GPU architectures influence the performance of Cellular Automata implementations. • Novel techniques that improve memory-bounded Cellular Automata performance in GPUs. • Experimental results on bi-dimensional CA to compare the performance of the novel techniques. |
doi_str_mv | 10.1016/j.simpat.2022.102519 |
format | article |
fullrecord | Source: ScienceDirect Freedom Collection (source ID elsevier_cross, record type: article); Elsevier ID: S1569190X22000259; ISSN: 1569-190X; EISSN: 1878-1462; DOI: 10.1016/j.simpat.2022.102519; publisher: Elsevier B.V; rights: 2022 The Authors; peer reviewed; open access (free for read); ORCID: https://orcid.org/0000-0002-2792-2844; pages: 102519- |
fulltext | fulltext |
identifier | ISSN: 1569-190X |
ispartof | Simulation modelling practice and theory, 2022-07, Vol.118, p.102519, Article 102519 |
issn | 1569-190X; 1878-1462 |
language | eng |
recordid | cdi_crossref_primary_10_1016_j_simpat_2022_102519 |
source | ScienceDirect Freedom Collection |
subjects | Cellular automata; Graphics Processing Units; Parallel computing; Performance optimization; Stencil computation |
title | Efficient simulation execution of cellular automata on GPU |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T01%3A56%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Efficient%20simulation%20execution%20of%20cellular%20automata%20on%20GPU&rft.jtitle=Simulation%20modelling%20practice%20and%20theory&rft.au=Cagigas-Mu%C3%B1iz,%20Daniel&rft.date=2022-07&rft.volume=118&rft.spage=102519&rft.pages=102519-&rft.artnum=102519&rft.issn=1569-190X&rft.eissn=1878-1462&rft_id=info:doi/10.1016/j.simpat.2022.102519&rft_dat=%3Celsevier_cross%3ES1569190X22000259%3C/elsevier_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c352t-fe4a0a85a015ed5c303e5145b20df8332ae0dfc933ce0a94a788885a87dc5eef3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |