Loading…

Backtracking-based Load Balancing

High-productivity languages for parallel computing become more important as parallel environments including multicores become more common. Cilk is such a language. It provides good load balancing for many applications including irregular ones; that is, it keeps all workers busy by creating plenty of...

Full description

Saved in:
Bibliographic Details
Published in:ACM SIGPLAN notices 2009-02, Vol.44 (4), p.55-64
Main Authors: HIRAISHI, Tasuku, YASUGI, Masahiro, UMATANI, Seiji, YUASA, Taiichi
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:High-productivity languages for parallel computing become more important as parallel environments including multicores become more common. Cilk is such a language. It provides good load balancing for many applications including irregular ones; that is, it keeps all workers busy by creating plenty of "logical" threads and adopting the oldest-first work stealing strategy. This paper proposes a "logical thread"-free framework called Tascell , which achieves a higher performance and supports a wider range of parallel environments including clusters without loss of productivity. A Tascell worker spawns a "real" task only when requested by another idle worker. The worker performs the spawning by temporarily "backtracking" and restoring its oldest task-spawnable state. Our approach eliminates the cost of spawning/managing logical threads. It also promotes the reuse of workspaces and improves the locality of reference since it does not need to prepare a workspace for each concurrently runnable logical thread. Furthermore, Tascell enables elegant and highly-efficient backtrack search algorithms with delayed workspace copying. For instance, our 16-queens problem solver is 1.86 times faster than Cilk on a system with two dual-core processors. Our approach also enables a single program to run in both shared and distributed memory environments with reasonable efficiency and scalability.
ISSN:1523-2867
0362-1340
1558-1160
DOI:10.1145/1594835.1504187