
A Simulation Framework to Automatically Analyze the Communication-Computation Overlap in Scientific Applications


Bibliographic Details
Main Authors: Subotic, Vladimir; Sancho, Jose Carlos; Labarta, Jesus; Valero, Mateo
Format: Conference Proceeding
Language: English
Description
Summary: Overlapping communication and computation is an attractive technique for alleviating the heavy network requirements of applications at large scale. Overlap makes it possible to fully or partially hide the long communication delays incurred when transferring messages through the network. This relaxes the application's network requirements and hence allows more cost-effective network designs to be deployed. However, today's scientific applications make little use of overlap. In addition, there is no tool support for analyzing how overlap could affect the performance of real scientific applications. In this paper we address this issue by presenting a simulation framework that automatically analyzes the benefits of communication-computation overlap. The simulation framework consists of a binary translation tool (Valgrind), a distributed machine simulator (Dimemas), and a visualization tool (Paraver). Valgrind instruments the legacy MPI application and generates execution traces; Dimemas then uses these traces to reconstruct the application's time behavior on a configurable parallel platform; finally, Paraver visualizes the resulting time behaviors. Our simulation methodology brings two new features to the study of overlap: 1) automatic simulation of the overlapped execution, with no need to restructure application code; and 2) visualization of the simulated time behaviors, which allows useful comparisons of the non-overlapped and overlapped executions.
ISSN: 1552-5244, 2168-9253
DOI: 10.1109/CLUSTER.2010.33
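
The kind of comparison the summary describes, non-overlapped versus overlapped execution, can be illustrated with a toy timing model. The sketch below is not the Dimemas model or the authors' method. It assumes a hypothetical trace in which each phase is a compute burst followed by a message transfer, and it assumes that overlap can hide at most the shorter of the two. The names `Phase`, `non_overlapped_time`, and `overlapped_time` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One compute burst followed by one transfer (hypothetical trace record)."""
    compute_s: float   # seconds spent computing
    transfer_s: float  # seconds spent moving the message


def non_overlapped_time(phases):
    """Total time when every transfer waits for its compute burst to finish."""
    return sum(p.compute_s + p.transfer_s for p in phases)


def overlapped_time(phases, fraction=1.0):
    """Total time when a given fraction of the hideable time is overlapped.

    Assumption: per phase, at most min(compute, transfer) can be hidden.
    fraction=1.0 is ideal overlap; fraction=0.0 is no overlap.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    return sum(
        p.compute_s + p.transfer_s - fraction * min(p.compute_s, p.transfer_s)
        for p in phases
    )


# Made-up phase durations, chosen only to exercise the model.
trace = [Phase(4.0, 1.0), Phase(2.0, 3.0), Phase(1.0, 1.0)]

baseline = non_overlapped_time(trace)
ideal = overlapped_time(trace)
half = overlapped_time(trace, fraction=0.5)

print(f"non-overlapped: {baseline:.1f} s")
print(f"ideal overlap:  {ideal:.1f} s (speedup {baseline / ideal:.2f}x)")
print(f"50% overlap:    {half:.1f} s")
```

Under these assumptions the gain is bounded by the shorter of computation and communication in each phase, so overlap pays off most when the two are balanced.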