
Exploiting global structure for performance on clusters



Bibliographic Details
Main Authors: Donaldson, S.R., Hill, J.M.D., Skillicorn, D.B.
Format: Conference Proceeding
Language: English
Description
Summary: Most parallel programming models for distributed-memory architectures are based on individual threads interacting via send and receive operations. We show that a more structured model, BSP, gains substantial performance improvements by exploiting the extra information implicit in its structure. In particular, each thread learns something about global state whenever it receives a message. This information can be used to modify its own behavior to improve collective use of the communication system. The programming model's semantics also provides implicit knowledge that can be exploited to increase performance. We show that these effects are useful at the application level by comparing the performance of BSP and MPI implementations of the NAS parallel benchmarks.
ISSN: 1063-7133
DOI: 10.1109/IPPS.1999.760455