Parallelization of a Three-Dimensional Flow Solver for Euler Rotorcraft Aerodynamics Predictions
Main Authors: | , , |
---|---|
Format: | Report |
Language: | English |
Summary: | An approach for parallelizing the three-dimensional Euler/Navier-Stokes rotorcraft computational fluid dynamics flow solver transonic unsteady rotor Navier-Stokes (TURNS) is introduced. Parallelization is performed using a domain decomposition technique developed for distributed-memory parallel architectures. Communication between the subdomains on each processor is performed via message passing, in the form of message passing interface (MPI) subroutine calls. The most difficult portion of the TURNS algorithm to implement efficiently in parallel is the implicit time step, which uses the lower-upper symmetric Gauss-Seidel (LU-SGS) algorithm. Two modifications of LU-SGS are proposed to improve the parallel performance. First, a previously introduced Jacobi-like method called data-parallel lower-upper relaxation (DP-LUR) is used. Second, a new hybrid method is introduced that combines the Jacobi sweeping approach of DP-LUR for interprocessor communications with the symmetric Gauss-Seidel algorithm of LU-SGS for on-processor computations. The parallelized TURNS code with the modified implicit operator is implemented on two distributed-memory multiprocessors, the IBM SP2 and the Thinking Machines CM-5, and used to compute the three-dimensional quasisteady and unsteady flowfield of a helicopter rotor in forward flight. The code exhibits good parallel speedups with a low percentage of communication. The proposed hybrid algorithm requires less CPU time than DP-LUR while maintaining comparable parallel speedups and communication costs. On 114 processors of the IBM SP2, the solution time of both quasisteady and unsteady calculations is reduced by a factor of about 12 relative to a single processor of the Cray C-90.
Published in the AIAA Journal, v34 n11 p2276-2283, Nov 1996. Prepared in collaboration with University of Minnesota, Minneapolis, MN and Purdue University, West Lafayette, IN. |
---|
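
The hybrid idea described in the summary (Jacobi-style lagged values across subdomain boundaries, symmetric Gauss-Seidel sweeps inside each subdomain) can be illustrated on a toy problem. The sketch below is not the TURNS implicit operator or the authors' code. It assumes a scalar, diagonally dominant tridiagonal system with zero boundary values, and the names `apply_matrix`, `hybrid_sweep`, and `solve` are hypothetical helpers chosen for illustration. Subdomains are processed in a serial loop that stands in for separate processors. Values outside a subdomain are read from a frozen copy of the previous iterate, which plays the role of data received by message passing.

```python
def apply_matrix(x, diag=4.0, off=-1.0):
    """Multiply x by the tridiagonal model matrix (zero values beyond both ends)."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        out.append(diag * x[i] + off * (left + right))
    return out


def hybrid_sweep(x, b, subdomains, diag=4.0, off=-1.0):
    """One hybrid iteration: Jacobi between subdomains, symmetric Gauss-Seidel within."""
    n = len(x)
    old = list(x)  # frozen previous iterate: stands in for halo data from neighbours
    new = list(x)  # updated in place; each subdomain writes only its own entries

    def value(j, lo, hi):
        if j < 0 or j >= n:
            return 0.0  # assumed zero boundary condition
        # Inside the subdomain use the latest value, outside use the lagged one.
        return new[j] if lo <= j < hi else old[j]

    for lo, hi in subdomains:  # each (lo, hi) is the half-open index range [lo, hi)
        forward = list(range(lo, hi))
        backward = list(range(hi - 1, lo - 1, -1))
        for i in forward + backward:  # forward sweep, then backward sweep
            neighbours = value(i - 1, lo, hi) + value(i + 1, lo, hi)
            new[i] = (b[i] - off * neighbours) / diag
    return new


def solve(b, subdomains, tol=1e-10, max_iter=500):
    """Repeat hybrid sweeps until the max-norm residual drops below tol."""
    x = [0.0] * len(b)
    for it in range(1, max_iter + 1):
        x = hybrid_sweep(x, b, subdomains)
        residual = max(abs(bi - ai) for bi, ai in zip(b, apply_matrix(x)))
        if residual < tol:
            return x, it
    return x, max_iter
```

Because each subdomain reads off-subdomain values only from the frozen copy, the result does not depend on the order in which subdomains are visited, which is what would allow them to run concurrently. In this toy setting, a single subdomain covering all unknowns reduces to plain symmetric Gauss-Seidel, and subdomains of one unknown each reduce to a Jacobi update.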