An octree-based, Cartesian Navier–Stokes solver for modern cluster architectures

Bibliographic Details
Published in: The Journal of Supercomputing, 2022, Vol. 78 (9), p. 11409-11440
Main Authors: Jude, Dylan; Sitaraman, Jayanarayanan; Wissink, Andrew
Format: Article
Language: English
Description
Summary: Adaptive Cartesian mesh approaches have proven useful for multi-scale applications where particular features can be finely resolved within a large solution domain. Traditional patch-based mesh refinement has demonstrated widespread applicability across a range of problems, but can face performance challenges when applied to very large cases with billions of grid points running on large-scale hybrid CPU/GPU architectures. This work investigates an octree-based method combined with traditional finite-difference algorithms specifically designed to execute structured mesh refinement applications efficiently on modern cluster architectures. The primary application of the approach is the solution of helicopter rotor aerodynamics, where it is desirable to resolve time-dependent, fine-scale tip vortices within a solution domain that encompasses the entire helicopter and extends several rotor diameters away. This work demonstrates that the octree construction and balance algorithms scale to billions of mesh cells. A canonical problem (convecting vortex) and two application problems (helicopter rotor simulations) verify and validate the performance and accuracy of the developed framework, Orchard, on CPU and GPU architectures. Scaling on CPUs and GPUs is demonstrated up to 140 Xeon sockets and 36 V100 GPUs, respectively. The solver on GPUs demonstrates an order-of-magnitude speedup over execution on traditional CPU cluster nodes.
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-022-04324-7