Reparallelization techniques for migrating OpenMP codes in computational grids
Published in: | Concurrency and computation 2009-03, Vol.21 (3), p.281-299 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Summary: | Typical computational grid users target only a single cluster and have to estimate the runtime of their jobs. Job schedulers prefer short‐running jobs to maintain a high system utilization. If the user underestimates the runtime, premature termination causes computation loss; overestimation is penalized by long queue times. As a solution, we present an automatic reparallelization and migration of OpenMP applications. A reparallelization is dynamically computed for an OpenMP work distribution when the number of CPUs changes. The application can be migrated between clusters when an allocated time slice is exceeded. Migration is based on a coordinated, heterogeneous checkpointing algorithm. Both reparallelization and migration enable the user to freely use computing time at more than a single point of the grid. Our demo applications successfully adapt to the changed CPU setting and smoothly migrate between, for example, clusters in Erlangen, Germany, and Amsterdam, the Netherlands, that use different kinds and numbers of processors. Benchmarks show that reparallelization and migration impose average overheads of about 4% and 2%, respectively. Copyright © 2008 John Wiley & Sons, Ltd. |
ISSN: | 1532-0626, 1532-0634 |
DOI: | 10.1002/cpe.1356 |
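
The summary states that a new work distribution is computed when the number of CPUs changes. As a rough illustration of what recomputing a distribution can mean, the sketch below splits a loop's iteration range into contiguous blocks, similar in spirit to a static OpenMP schedule, and recomputes the blocks for a different worker count. It is not the authors' algorithm: the function names `block_range` and `redistribute`, the iteration count, and the assumption that the CPU count changes between two executions of the loop are all hypothetical and chosen only for this example.

```python
def block_range(lo, hi, workers, worker_id):
    """Return the half-open range [begin, end) of the iterations in [lo, hi)
    that a static block distribution assigns to worker_id.

    The first (hi - lo) % workers workers receive one extra iteration.
    """
    base, extra = divmod(hi - lo, workers)
    begin = lo + worker_id * base + min(worker_id, extra)
    end = begin + base + (1 if worker_id < extra else 0)
    return begin, end


def redistribute(lo, hi, workers):
    """Partition the iterations [lo, hi) across the given number of workers."""
    return [block_range(lo, hi, workers, w) for w in range(workers)]


# Hypothetical scenario: a 10-iteration loop runs on 4 CPUs, then the job
# continues on 3 CPUs and the distribution is recomputed before the next run.
before = redistribute(0, 10, 4)
after = redistribute(0, 10, 3)
print(before)  # [(0, 3), (3, 6), (6, 8), (8, 10)]
print(after)   # [(0, 4), (4, 7), (7, 10)]
```

In both partitions every iteration belongs to exactly one worker, so the loop covers the same range regardless of how many CPUs are available; only the block boundaries move.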