Algorithmic choices in WARP – A framework for continuous energy Monte Carlo neutron transport in general 3D geometries on GPUs
Published in: | Annals of nuclear energy 2015-03, Vol.77 (C), p.176-193 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Summary: | • WARP, a GPU-accelerated Monte Carlo neutron transport code, has been developed. • The NVIDIA OptiX high-performance ray tracing library is used to process geometric data. • The unionized cross section representation is modified for higher performance. • Reference remapping is used to keep the GPU busy as neutron batch population reduces. • Reference remapping is done using a key-value radix sort on neutron reaction type.
In recent supercomputers, general-purpose graphics processing units (GPGPUs) account for a significant fraction of total computational power. GPGPU architectures differ from those of central processing units (CPUs), so Monte Carlo neutron transport codes used in nuclear engineering must change their transport algorithms to execute efficiently on these coprocessor cards. WARP is a continuous energy Monte Carlo neutron transport code written to do this. The main thrust of WARP is to adapt previous event-based transport algorithms to the new GPU hardware; the algorithmic choices for all parts of the code are presented in this paper. It is found that remapping history data references increases the GPU processing rate when histories start to complete. The main reasons are that completed data are eliminated from the address space, threads are kept busy, and memory bandwidth is not wasted on checking completed data. Remapping also allows the interaction kernels to be launched concurrently, improving efficiency. The OptiX ray tracing framework and the CUDPP library are used for geometry representation and parallel dataset-wide operations, respectively, ensuring high performance and reliability. |
---|---|
ISSN: | 0306-4549 1873-2100 |
DOI: | 10.1016/j.anucene.2014.10.039 |
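
The reference remapping described in the summary can be illustrated with a small CPU-side sketch. This is not WARP's code: the reaction codes, the function name `build_remap`, and the use of Python's stable `sorted` in place of a parallel GPU radix sort are all assumptions made for illustration. The idea shown is that history indices are sorted by reaction type with completed histories given the largest key, so active histories become contiguous and each reaction type occupies one slice of the remap array.

```python
# Illustrative sketch only; names and reaction codes are hypothetical, not WARP's.
from itertools import groupby

# Hypothetical reaction codes. DONE is the largest so finished histories sort last.
SCATTER, CAPTURE, FISSION, DONE = 0, 1, 2, 3


def build_remap(reactions):
    """Sort history indices by reaction code (a key-value sort).

    Returns:
      order    : history indices reordered so equal reaction codes are contiguous
      n_active : number of histories whose code is not DONE
      ranges   : {reaction code: (start, stop)} slices of `order` per active code
    """
    # Key = reaction code, value = history index. sorted() is a stable
    # comparison sort; a GPU code would use a parallel radix sort instead.
    order = sorted(range(len(reactions)), key=lambda i: reactions[i])
    n_active = sum(1 for r in reactions if r != DONE)

    # Walk only the active prefix and record where each reaction type starts/ends.
    ranges = {}
    pos = 0
    for code, group in groupby(order[:n_active], key=lambda i: reactions[i]):
        count = len(list(group))
        ranges[code] = (pos, pos + count)
        pos += count
    return order, n_active, ranges


reactions = [SCATTER, DONE, FISSION, SCATTER, CAPTURE, DONE, SCATTER, FISSION]
order, n_active, ranges = build_remap(reactions)
print(order, n_active, ranges)
```

Under these assumptions, each `(start, stop)` slice could be handed to its own interaction routine, and only the first `n_active` references are ever visited, which mirrors the abstract's point that completed data drop out of the address space.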