Loading…

Physis: an implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers

This paper proposes a compiler-based programming framework that automatically translates user-written structured grid code into scalable parallel implementation code for GPU-equipped clusters. To enable such automatic translations, we design a small set of declarative constructs that allow the user...

Full description

Saved in:
Bibliographic Details
Main Authors: Maruyama, Naoya, Nomura, Tatsuo, Sato, Kento, Matsuoka, Satoshi
Format: Conference Proceeding
Language:English
Subjects:
Citations: Items that cite this one
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:This paper proposes a compiler-based programming framework that automatically translates user-written structured grid code into scalable parallel implementation code for GPU-equipped clusters. To enable such automatic translations, we design a small set of declarative constructs that allow the user to express stencil computations in a portable and implicitly parallel manner. Our framework translates the user-written code into actual implementation code in CUDA for GPU acceleration and MPI for node-level parallelization with automatic optimizations such as computation and communication overlapping. We demonstrate the feasibility of such automatic translations by implementing several structured grid applications in our framework. Experimental results on the TSUBAME2.0 GPU-based supercomputer show that the performance is comparable as hand-written code and good strong and weak scalability up to 256 GPUs.
ISSN:2167-4329
2167-4337
DOI:10.1145/2063384.2063398