
On expected and observed communication performance with MPI derived datatypes

Bibliographic Details
Published in: Parallel Computing, 2017-11, Vol. 69, p. 98-117
Main Authors: Carpen-Amarie, Alexandra; Hunold, Sascha; Träff, Jesper Larsson
Format: Article
Language: English
Description
Summary:
Highlights:
• A framework for exploring communication performance with MPI derived datatypes.
• Performance guidelines for derived datatypes.
• A thorough experimental exploration, structured by the performance guidelines, of datatype communication performance in four current MPI libraries on two different systems.

We are interested in the cost of communicating simple, common, non-contiguous data layouts in various scenarios using the MPI derived datatype mechanism. Our aim is twofold. First, we provide a framework for studying communication performance for non-contiguous data layouts described with MPI derived datatypes, in comparison to baseline performance with the same amount of contiguously stored data. Second, we explicate natural expectations on derived datatype communication performance that any MPI library implementation should arguably fulfill. These expectations are stated semi-formally as MPI datatype performance guidelines. Using our framework, we examine several MPI libraries on two different systems. Our findings are in many ways surprising and disappointing. First, using derived datatypes as intended by the MPI standard sometimes performs worse than the semantically equivalent packing and unpacking with the corresponding MPI functionality, followed by contiguous communication. Second, communication performance with a single, contiguous datatype can be significantly worse than with a repetition of its constituent datatype. Third, the heuristics typically employed by MPI libraries at type-commit time turn out to be insufficient to enforce the performance guidelines, showing room for better algorithms and heuristics for representing and processing derived datatypes in MPI libraries. In particular, we show cases where all MPI type constructors are necessary to achieve the expected performance. Our findings provide useful information to MPI library implementers, and hints to application programmers on good use of derived datatypes. Improved MPI libraries can be validated using our framework and approach.
ISSN: 0167-8191; 1872-7336
DOI: 10.1016/j.parco.2017.08.006