Loading…

Parallel SPH modeling using dynamic domain decomposition and load balancing displacement of Voronoi subdomains

A highly adaptive load balancing algorithm for parallel simulations using particle methods, such as molecular dynamics and smoothed particle hydrodynamics (SPH), is developed. Our algorithm is based on the dynamic spatial decomposition of simulated material samples between Voronoi subdomains, where...

Full description

Saved in:
Bibliographic Details
Published in:Computer physics communications 2019-01, Vol.234, p.112-125
Main Authors: Egorova, M.S., Dyachkov, S.A., Parshikov, A.N., Zhakhovsky, V.V.
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:A highly adaptive load balancing algorithm for parallel simulations using particle methods, such as molecular dynamics and smoothed particle hydrodynamics (SPH), is developed. Our algorithm is based on the dynamic spatial decomposition of simulated material samples between Voronoi subdomains, where each subdomain with all its particles is handled by a single computational process which is typically run on a single CPU core of a multiprocessor computing cluster. The algorithm displaces the positions of neighbor Voronoi subdomains in accordance with the local load imbalance between the corresponding processes. It results in particle transfers from heavy-loaded processes to less-loaded ones. Iteration of the algorithm puts into alignment the processor loads. Convergence to a well-balanced decomposition from imbalanced one is improved by the usage of multi-body terms in the balancing displacements. The high adaptability of the balancing algorithm to simulation conditions is illustrated by SPH modeling of the dynamic behavior of materials under extreme conditions, which are characterized by large pressure and velocity gradients, as a result of which the spatial distribution of particles varies greatly in time. The higher parallel efficiency of our algorithm in such conditions is demonstrated by comparison with the corresponding static decomposition of the computational domain. Our algorithm shows almost perfect strong scalability in tests using from tens to several thousand processes.
ISSN:0010-4655
1879-2944
DOI:10.1016/j.cpc.2018.07.019