Loading…

Development of a parallel CUDA algorithm for solving 3D guiding center problems

In this study, we develop a novel compute unified device architecture (CUDA) algorithm, which we call C-ECM3, for solving a three-dimensional (3D) guiding center problem. The C-ECM3 is a parallel algorithm for the iterative-free backward semi-Lagrangian method with third-order temporal accuracy (ECM...

Full description

Saved in:
Bibliographic Details
Published in:Computer physics communications 2022-07, Vol.276, p.108331, Article 108331
Main Authors: Bak, Soyoon, Kim, Philsu, Park, Sangbeom
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In this study, we develop a novel compute unified device architecture (CUDA) algorithm, which we call C-ECM3, for solving a three-dimensional (3D) guiding center problem. The C-ECM3 is a parallel algorithm for the iterative-free backward semi-Lagrangian method with third-order temporal accuracy (ECM3). One well known challenge in speeding up a CUDA program is to efficiently design kernel functions that can optimally use hierarchical memory classified according to access speed. To solve this challenge, the C-ECM3 is mainly devoted to making a decomposition strategy for solving the tremendous number of generated Cauchy problems. The decomposition strategy divides the 9×9 linear system for each Cauchy problem in the ECM3 into two 3×3 linear systems, more solverable parts. In addition, the strategy explicitly solves these small systems using Cramer's rule. It turns out that the proposed C-ECM3 enables us to design an array-free kernel function that efficiently uses hierarchical memory. In addition, the C-ECM3 significantly reduces the run-time for tracing trajectories of particles compared to other graphics processing unit (GPU) programs that use the usual Gaussian algorithm. The Kelvin-Helmholtz instability and a 3D guiding center problem are simulated to demonstrate the numerical evidence for the C-ECM3. With these numerical experiments, we verify that the proposed C-ECM3 significantly improves computational speed compared to other methods while maintaining the accuracy of the CPU (central processing unit) version of ECM3. The validity of the C-ECM3 is also confirmed by showing that it satisfies Shoucri's analysis for Kelvin-Helmholtz instability.
ISSN:0010-4655
1879-2944
DOI:10.1016/j.cpc.2022.108331