Development of a parallel CUDA algorithm for solving 3D guiding center problems
Published in: | Computer Physics Communications, 2022-07, Vol. 276, Article 108331 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | In this study, we develop a novel compute unified device architecture (CUDA) algorithm, called C-ECM3, for solving a three-dimensional (3D) guiding center problem. C-ECM3 is a parallel algorithm for the iterative-free backward semi-Lagrangian method with third-order temporal accuracy (ECM3). One well-known challenge in speeding up a CUDA program is designing kernel functions that make optimal use of the memory hierarchy, whose levels differ in access speed. To address this challenge, C-ECM3 centers on a decomposition strategy for solving the very large number of Cauchy problems that the method generates. The strategy divides the 9×9 linear system for each Cauchy problem in ECM3 into two 3×3 linear systems, which are easier to solve, and then solves these small systems explicitly using Cramer's rule. As a result, C-ECM3 permits an array-free kernel function that uses hierarchical memory efficiently. In addition, C-ECM3 significantly reduces the run time for tracing particle trajectories compared with other graphics processing unit (GPU) programs that use standard Gaussian elimination. The Kelvin-Helmholtz instability and a 3D guiding center problem are simulated to provide numerical evidence for C-ECM3. These numerical experiments verify that C-ECM3 significantly improves computational speed compared with other methods while maintaining the accuracy of the central processing unit (CPU) version of ECM3. The validity of C-ECM3 is also confirmed by showing that it satisfies Shoucri's analysis of the Kelvin-Helmholtz instability. |
---|---|
ISSN: | 0010-4655 1879-2944 |
DOI: | 10.1016/j.cpc.2022.108331 |
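
The summary describes solving small 3×3 linear systems explicitly with Cramer's rule so that a kernel needs no local arrays. The following is a minimal sketch of that idea only, not the authors' implementation: the function names `det3` and `solve3x3_cramer`, the argument layout, and the singularity tolerance are all assumptions made for illustration. It is written as plain C++ so it can be checked on the host; under the assumption that it is compiled with a CUDA compiler, the same scalar-only functions could be marked `__host__ __device__` and called per particle from a kernel.

```cpp
#include <cmath>

// Hypothetical helper (not from the paper): determinant of a 3x3 matrix
// passed as nine scalars, so no array storage is required.
inline double det3(double a11, double a12, double a13,
                   double a21, double a22, double a23,
                   double a31, double a32, double a33) {
    return a11 * (a22 * a33 - a23 * a32)
         - a12 * (a21 * a33 - a23 * a31)
         + a13 * (a21 * a32 - a22 * a31);
}

// Hypothetical solver (not from the paper): solves A x = b for a 3x3 matrix A
// by Cramer's rule using scalars only. Returns false when |det(A)| is below
// `tol` (an assumed threshold), in which case x1..x3 are left unchanged.
inline bool solve3x3_cramer(double a11, double a12, double a13,
                            double a21, double a22, double a23,
                            double a31, double a32, double a33,
                            double b1, double b2, double b3,
                            double &x1, double &x2, double &x3,
                            double tol = 1e-14) {
    const double d = det3(a11, a12, a13, a21, a22, a23, a31, a32, a33);
    if (std::fabs(d) < tol) {
        return false;  // treat the system as singular
    }
    // Each unknown is the determinant of A with one column replaced by b,
    // divided by det(A).
    x1 = det3(b1, a12, a13, b2, a22, a23, b3, a32, a33) / d;
    x2 = det3(a11, b1, a13, a21, b2, a23, a31, b3, a33) / d;
    x3 = det3(a11, a12, b1, a21, a22, b2, a31, a32, b3) / d;
    return true;
}
```

Because every intermediate value is a scalar, a compiler can keep the whole computation in registers, which is the kind of memory behavior the summary attributes to an array-free kernel. Gaussian elimination on a general system, by contrast, typically works on an indexed array and needs pivoting logic.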