An octree-based, Cartesian Navier–Stokes solver for modern cluster architectures
Published in: | The Journal of supercomputing 2022, Vol.78 (9), p.11409-11440 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | Adaptive Cartesian mesh approaches have proven useful for multi-scale applications where particular features can be finely resolved within a large solution domain. Traditional patch-based mesh refinement has demonstrated widespread applicability across a range of problems, but can face performance challenges when applied to very large cases with billions of grid points running on large-scale hybrid CPU/GPU architectures. This work investigates an octree-based method combined with traditional finite-difference algorithms specifically designed to execute structured mesh refinement applications efficiently on modern cluster architectures. The primary application of the approach is the solution of helicopter rotor aerodynamics, where it is desirable to resolve time-dependent, fine-scale tip vortices within a solution domain that encompasses the entire helicopter and extends several rotor diameters away. This work demonstrates that the octree construction and balance algorithms scale to billions of mesh cells. A canonical problem (convecting vortex) and two application problems (helicopter rotor simulations) verify and validate the performance and accuracy of the developed framework, Orchard, on CPU and GPU architectures. Scaling on CPUs and GPUs is demonstrated up to 140 Xeon sockets and 36 V100 GPUs, respectively. The solver on GPUs demonstrates an order-of-magnitude speedup over execution on traditional CPU cluster nodes. |
ISSN: | 0920-8542, 1573-0484 |
DOI: | 10.1007/s11227-022-04324-7 |