
Efficient simulation execution of cellular automata on GPU

Graphics Processing Units (GPUs) can be used as convenient hardware accelerators to speed up Cellular Automata (CA) simulations, which are employed in many scientific areas. However, an important set of CA have performance constraints due to GPU memory bandwidth. Few studies have fully explored how...

Bibliographic Details
Published in: Simulation modelling practice and theory, 2022-07, Vol.118, p.102519, Article 102519
Main Authors: Cagigas-Muñiz, Daniel; Diaz-del-Rio, Fernando; Sevillano-Ramos, Jose Luis; Guisado-Lizar, Jose-Luis
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c352t-fe4a0a85a015ed5c303e5145b20df8332ae0dfc933ce0a94a788885a87dc5eef3
cites cdi_FETCH-LOGICAL-c352t-fe4a0a85a015ed5c303e5145b20df8332ae0dfc933ce0a94a788885a87dc5eef3
container_end_page
container_issue
container_start_page 102519
container_title Simulation modelling practice and theory
container_volume 118
creator Cagigas-Muñiz, Daniel
Diaz-del-Rio, Fernando
Sevillano-Ramos, Jose Luis
Guisado-Lizar, Jose-Luis
description Graphics Processing Units (GPUs) can serve as convenient hardware accelerators to speed up Cellular Automata (CA) simulations, which are used in many scientific areas. However, the performance of an important set of CA is constrained by GPU memory bandwidth. Few studies have fully explored how CA implementations can take advantage of modern GPU architectures, particularly when memory usage is intensive. In this paper, we thoroughly study techniques (a stencil computing framework, look-up tables, and packet coding) for implementing CA efficiently on GPU, taking its detailed architecture into account. Exhaustive experiments are performed to validate these implementation techniques for a number of significant memory-bounded CA. The CA analysed include the classical Game of Life, a Forest Fire model, a Cyclic cellular automaton, and the WireWorld CA. The experimental results show that implementations using the presented techniques can significantly outperform a baseline standard GPU implementation. The best performance results of all known implementations of memory-bounded CA were obtained. Moreover, some of the techniques, such as look-up tables or temporal blocking, are relatively easy to implement or to apply when the transition rules are simple. Finally, detailed descriptions and discussions of these techniques are included, which may be useful to practitioners interested in developing high-performance CA-based simulations on GPU in efficient languages.
•A thorough review of Cellular Automata implementations on modern GPUs.
•Study of which general aspects of GPU architectures influence the performance of Cellular Automata implementations.
•Novel techniques that improve the performance of memory-bounded Cellular Automata on GPUs.
•Experimental results on two-dimensional CA comparing the performance of the novel techniques.
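The description above names look-up tables as one of the techniques that are easy to apply when the transition rules are simple. The following is only a minimal CPU-side sketch of that idea in Python, not the authors' GPU code: it assumes Conway's Game of Life rules, a table indexed by current state and live-neighbour count, and wrap-around (toroidal) borders. The names `LUT` and `step` are hypothetical and do not come from the paper.

```python
# Hypothetical illustration; names and layout are assumptions, not from the paper.
# LUT[state][live_neighbours] -> next state for Conway's Game of Life.
LUT = [
    [1 if n == 3 else 0 for n in range(9)],       # dead cell: born with exactly 3 neighbours
    [1 if n in (2, 3) else 0 for n in range(9)],  # live cell: survives with 2 or 3 neighbours
]


def step(grid):
    """Return the next generation of a 2D grid of 0/1 cells (toroidal borders)."""
    rows, cols = len(grid), len(grid[0])
    nxt = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Stencil read: count live cells among the 8 Moore neighbours.
            n = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # The transition rule is a single table read instead of branches.
            nxt[r][c] = LUT[grid[r][c]][n]
    return nxt


if __name__ == "__main__":
    blinker = [[0] * 5 for _ in range(5)]
    blinker[2][1] = blinker[2][2] = blinker[2][3] = 1
    for row in step(blinker):
        print("".join("#" if cell else "." for cell in row))
```

Replacing the rule's conditionals with a table read keeps the per-cell work uniform. On a GPU, that uniformity is the kind of property that could help avoid divergent branches, although the sketch itself says nothing about the speed-ups reported in the paper.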
doi_str_mv 10.1016/j.simpat.2022.102519
format article
fullrecord <record><control><sourceid>elsevier_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1016_j_simpat_2022_102519</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1569190X22000259</els_id><sourcerecordid>S1569190X22000259</sourcerecordid><originalsourceid>FETCH-LOGICAL-c352t-fe4a0a85a015ed5c303e5145b20df8332ae0dfc933ce0a94a788885a87dc5eef3</originalsourceid><addsrcrecordid>eNp9j09LxDAQxYMouK5-Aw_9Aq3507SpB0GWdRUW9OCCtzCmE0hpt0uSin57s1vPzmUeb3iP-RFyy2jBKKvuuiK44QCx4JTzZHHJmjOyYKpWOSsrfp60rJqcNfTjklyF0FHKlKrqBblfW-uMw33MUsfUQ3TjPsNvNNNJjTYz2Pfp4DOY4jhAhCz5m7fdNbmw0Ae8-dtLsntav6-e8-3r5mX1uM2NkDzmFkugoCRQJrGVRlCBkpXyk9PWKiE4YBKmEcIghaaEWqWRoOrWSEQrlqSce40fQ_Bo9cG7AfyPZlQf-XWnZ3595Nczf4o9zDFMv3059DocOQ22zqOJuh3d_wW_IfBmVA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Efficient simulation execution of cellular automata on GPU</title><source>ScienceDirect Freedom Collection</source><creator>Cagigas-Muñiz, Daniel ; Diaz-del-Rio, Fernando ; Sevillano-Ramos, Jose Luis ; Guisado-Lizar, Jose-Luis</creator><creatorcontrib>Cagigas-Muñiz, Daniel ; Diaz-del-Rio, Fernando ; Sevillano-Ramos, Jose Luis ; Guisado-Lizar, Jose-Luis</creatorcontrib><description>Graphics Processing Units (GPUs) can be used as convenient hardware accelerators to speed up Cellular Automata (CA) simulations, which are employed in many scientific areas. However, an important set of CA have performance constraints due to GPU memory bandwidth. Few studies have fully explored how CA implementations can take advantage of modern GPU architectures, mainly in the case of intensive memory usage. In this paper, we make a thorough study of techniques (stencil computing framework, look-up tables, and packet coding) to efficiently implement CA on GPU, taking into account its detailed architecture. 
Exhaustive experiments to validate these implementation techniques for a number of significant memory-bounded CA are performed. The CA analysed include the classical Game of Life, a Forest Fire model, a Cyclic cellular automaton, and the WireWorld CA. The experimental results show that implementations using the presented techniques can significantly outperform a baseline standard GPU implementation. The best performance results of all known implementations of memory bounded CA were obtained. Moreover, some of the techniques, like look-up tables or temporal blocking, are indeed relatively easy to implement or to apply when the transition rules are simple. Finally, detailed descriptions and discussions of the indicated techniques are included, which may be useful to practitioners interested in developing high performance simulations in efficient languages based on CA on GPU. •A thorough revision of Cellular Automata implementations in modern GPUs.•Study of which general aspects of GPU architectures influence in the performance of Cellular Automata implementations.•Novel techniques that improve memory bounded Cellular Automata performance in GPUs.•Experimental results on bi-dimensional CA to compare novel techniques of performance.</description><identifier>ISSN: 1569-190X</identifier><identifier>EISSN: 1878-1462</identifier><identifier>DOI: 10.1016/j.simpat.2022.102519</identifier><language>eng</language><publisher>Elsevier B.V</publisher><subject>Cellular automata ; Graphics Processing Units ; Parallel computing ; Performance optimization ; Stencil computation</subject><ispartof>Simulation modelling practice and theory, 2022-07, Vol.118, p.102519, Article 102519</ispartof><rights>2022 The 
Authors</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c352t-fe4a0a85a015ed5c303e5145b20df8332ae0dfc933ce0a94a788885a87dc5eef3</citedby><cites>FETCH-LOGICAL-c352t-fe4a0a85a015ed5c303e5145b20df8332ae0dfc933ce0a94a788885a87dc5eef3</cites><orcidid>0000-0002-2792-2844</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27923,27924</link.rule.ids></links><search><creatorcontrib>Cagigas-Muñiz, Daniel</creatorcontrib><creatorcontrib>Diaz-del-Rio, Fernando</creatorcontrib><creatorcontrib>Sevillano-Ramos, Jose Luis</creatorcontrib><creatorcontrib>Guisado-Lizar, Jose-Luis</creatorcontrib><title>Efficient simulation execution of cellular automata on GPU</title><title>Simulation modelling practice and theory</title><description>Graphics Processing Units (GPUs) can be used as convenient hardware accelerators to speed up Cellular Automata (CA) simulations, which are employed in many scientific areas. However, an important set of CA have performance constraints due to GPU memory bandwidth. Few studies have fully explored how CA implementations can take advantage of modern GPU architectures, mainly in the case of intensive memory usage. In this paper, we make a thorough study of techniques (stencil computing framework, look-up tables, and packet coding) to efficiently implement CA on GPU, taking into account its detailed architecture. Exhaustive experiments to validate these implementation techniques for a number of significant memory-bounded CA are performed. The CA analysed include the classical Game of Life, a Forest Fire model, a Cyclic cellular automaton, and the WireWorld CA. The experimental results show that implementations using the presented techniques can significantly outperform a baseline standard GPU implementation. 
The best performance results of all known implementations of memory bounded CA were obtained. Moreover, some of the techniques, like look-up tables or temporal blocking, are indeed relatively easy to implement or to apply when the transition rules are simple. Finally, detailed descriptions and discussions of the indicated techniques are included, which may be useful to practitioners interested in developing high performance simulations in efficient languages based on CA on GPU. •A thorough revision of Cellular Automata implementations in modern GPUs.•Study of which general aspects of GPU architectures influence in the performance of Cellular Automata implementations.•Novel techniques that improve memory bounded Cellular Automata performance in GPUs.•Experimental results on bi-dimensional CA to compare novel techniques of performance.</description><subject>Cellular automata</subject><subject>Graphics Processing Units</subject><subject>Parallel computing</subject><subject>Performance optimization</subject><subject>Stencil computation</subject><issn>1569-190X</issn><issn>1878-1462</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp9j09LxDAQxYMouK5-Aw_9Aq3507SpB0GWdRUW9OCCtzCmE0hpt0uSin57s1vPzmUeb3iP-RFyy2jBKKvuuiK44QCx4JTzZHHJmjOyYKpWOSsrfp60rJqcNfTjklyF0FHKlKrqBblfW-uMw33MUsfUQ3TjPsNvNNNJjTYz2Pfp4DOY4jhAhCz5m7fdNbmw0Ae8-dtLsntav6-e8-3r5mX1uM2NkDzmFkugoCRQJrGVRlCBkpXyk9PWKiE4YBKmEcIghaaEWqWRoOrWSEQrlqSce40fQ_Bo9cG7AfyPZlQf-XWnZ3595Nczf4o9zDFMv3059DocOQ22zqOJuh3d_wW_IfBmVA</recordid><startdate>202207</startdate><enddate>202207</enddate><creator>Cagigas-Muñiz, Daniel</creator><creator>Diaz-del-Rio, Fernando</creator><creator>Sevillano-Ramos, Jose Luis</creator><creator>Guisado-Lizar, Jose-Luis</creator><general>Elsevier 
B.V</general><scope>6I.</scope><scope>AAFTH</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-2792-2844</orcidid></search><sort><creationdate>202207</creationdate><title>Efficient simulation execution of cellular automata on GPU</title><author>Cagigas-Muñiz, Daniel ; Diaz-del-Rio, Fernando ; Sevillano-Ramos, Jose Luis ; Guisado-Lizar, Jose-Luis</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c352t-fe4a0a85a015ed5c303e5145b20df8332ae0dfc933ce0a94a788885a87dc5eef3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Cellular automata</topic><topic>Graphics Processing Units</topic><topic>Parallel computing</topic><topic>Performance optimization</topic><topic>Stencil computation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Cagigas-Muñiz, Daniel</creatorcontrib><creatorcontrib>Diaz-del-Rio, Fernando</creatorcontrib><creatorcontrib>Sevillano-Ramos, Jose Luis</creatorcontrib><creatorcontrib>Guisado-Lizar, Jose-Luis</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>CrossRef</collection><jtitle>Simulation modelling practice and theory</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Cagigas-Muñiz, Daniel</au><au>Diaz-del-Rio, Fernando</au><au>Sevillano-Ramos, Jose Luis</au><au>Guisado-Lizar, Jose-Luis</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Efficient simulation execution of cellular automata on GPU</atitle><jtitle>Simulation modelling practice and theory</jtitle><date>2022-07</date><risdate>2022</risdate><volume>118</volume><spage>102519</spage><pages>102519-</pages><artnum>102519</artnum><issn>1569-190X</issn><eissn>1878-1462</eissn><abstract>Graphics Processing Units 
(GPUs) can be used as convenient hardware accelerators to speed up Cellular Automata (CA) simulations, which are employed in many scientific areas. However, an important set of CA have performance constraints due to GPU memory bandwidth. Few studies have fully explored how CA implementations can take advantage of modern GPU architectures, mainly in the case of intensive memory usage. In this paper, we make a thorough study of techniques (stencil computing framework, look-up tables, and packet coding) to efficiently implement CA on GPU, taking into account its detailed architecture. Exhaustive experiments to validate these implementation techniques for a number of significant memory-bounded CA are performed. The CA analysed include the classical Game of Life, a Forest Fire model, a Cyclic cellular automaton, and the WireWorld CA. The experimental results show that implementations using the presented techniques can significantly outperform a baseline standard GPU implementation. The best performance results of all known implementations of memory bounded CA were obtained. Moreover, some of the techniques, like look-up tables or temporal blocking, are indeed relatively easy to implement or to apply when the transition rules are simple. Finally, detailed descriptions and discussions of the indicated techniques are included, which may be useful to practitioners interested in developing high performance simulations in efficient languages based on CA on GPU. 
•A thorough revision of Cellular Automata implementations in modern GPUs.•Study of which general aspects of GPU architectures influence in the performance of Cellular Automata implementations.•Novel techniques that improve memory bounded Cellular Automata performance in GPUs.•Experimental results on bi-dimensional CA to compare novel techniques of performance.</abstract><pub>Elsevier B.V</pub><doi>10.1016/j.simpat.2022.102519</doi><orcidid>https://orcid.org/0000-0002-2792-2844</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1569-190X
ispartof Simulation modelling practice and theory, 2022-07, Vol.118, p.102519, Article 102519
issn 1569-190X
1878-1462
language eng
recordid cdi_crossref_primary_10_1016_j_simpat_2022_102519
source ScienceDirect Freedom Collection
subjects Cellular automata
Graphics Processing Units
Parallel computing
Performance optimization
Stencil computation
title Efficient simulation execution of cellular automata on GPU
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T01%3A56%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Efficient%20simulation%20execution%20of%20cellular%20automata%20on%20GPU&rft.jtitle=Simulation%20modelling%20practice%20and%20theory&rft.au=Cagigas-Mu%C3%B1iz,%20Daniel&rft.date=2022-07&rft.volume=118&rft.spage=102519&rft.pages=102519-&rft.artnum=102519&rft.issn=1569-190X&rft.eissn=1878-1462&rft_id=info:doi/10.1016/j.simpat.2022.102519&rft_dat=%3Celsevier_cross%3ES1569190X22000259%3C/elsevier_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c352t-fe4a0a85a015ed5c303e5145b20df8332ae0dfc933ce0a94a788885a87dc5eef3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true