Backtracking-based Load Balancing

Bibliographic Details
Published in:	ACM SIGPLAN notices 2009-02, Vol.44 (4), p.55-64
Main Authors:	HIRAISHI, Tasuku, YASUGI, Masahiro, UMATANI, Seiji, YUASA, Taiichi
Format:	Article
Language:	English
Subjects:	Applied sciences; Computer science; control theory; systems; Computer systems and distributed systems. User interface; Exact sciences and technology; Information systems. Data bases; Memory organisation. Data processing; Software

Description
Summary:	High-productivity languages for parallel computing are becoming more important as parallel environments, including multicores, become more common. Cilk is such a language. It provides good load balancing for many applications, including irregular ones; that is, it keeps all workers busy by creating plenty of "logical" threads and adopting the oldest-first work-stealing strategy. This paper proposes a "logical thread"-free framework called Tascell, which achieves higher performance and supports a wider range of parallel environments, including clusters, without loss of productivity. A Tascell worker spawns a "real" task only when requested by another idle worker. The worker performs the spawning by temporarily "backtracking" and restoring its oldest task-spawnable state. Our approach eliminates the cost of spawning and managing logical threads. It also promotes the reuse of workspaces and improves the locality of reference, since it does not need to prepare a workspace for each concurrently runnable logical thread. Furthermore, Tascell enables elegant and highly efficient backtrack search algorithms with delayed workspace copying. For instance, our 16-queens problem solver is 1.86 times faster than Cilk on a system with two dual-core processors. Our approach also enables a single program to run in both shared and distributed memory environments with reasonable efficiency and scalability.
ISSN:	1523-2867, 0362-1340, 1558-1160
DOI:	10.1145/1594835.1504187
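
The mechanism outlined in the summary (spawn a task only on request, taken from the oldest task-spawnable state, copying the workspace only at that moment) can be illustrated with a small sequential simulation. This is a sketch under stated assumptions, not Tascell code or syntax: `Task`, `run_task`, `count_queens`, and the `steal_every` counter that stands in for requests from idle workers are all hypothetical names introduced here, and the "workers" run one after another rather than in parallel.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    prefix: list      # copied workspace: queen column for each row already placed
    candidates: list  # columns still to try in row len(prefix)


def safe(board, col):
    """True if a queen at (len(board), col) attacks no queen already on the board."""
    row = len(board)
    return all(c != col and abs(c - col) != row - r for r, c in enumerate(board))


def run_task(task, n, steal_requested):
    """Depth-first search on ONE reused board; split off work only when asked."""
    board = list(task.prefix)           # the single workspace for this task
    base = len(board)
    pending = [list(task.candidates)]   # pending[d]: untried columns in row base+d
    solutions, spawned = 0, []
    while pending:
        if steal_requested():
            # Look back to the oldest (shallowest) level that still has untried
            # choices and hand over about half of them with a copied prefix.
            for depth, cands in enumerate(pending):
                if cands:
                    give = cands[: (len(cands) + 1) // 2]
                    del cands[: len(give)]
                    spawned.append(Task(board[: base + depth], give))
                    break
        cands = pending[-1]
        if not cands:                   # level exhausted: ordinary backtracking
            pending.pop()
            if pending:
                board.pop()
            continue
        col = cands.pop()
        if safe(board, col):
            if len(board) == n - 1:     # last row filled: one complete solution
                solutions += 1
            else:
                board.append(col)
                pending.append(list(range(n)))
    return solutions, spawned


def count_queens(n, steal_every=50):
    """Simulated scheduler: a 'request' arrives on every steal_every-th check."""
    checks = 0

    def steal_requested():
        nonlocal checks
        checks += 1
        return checks % steal_every == 0

    queue = deque([Task([], list(range(n)))])
    total = tasks_run = 0
    while queue:
        found, spawned = run_task(queue.popleft(), n, steal_requested)
        total += found
        tasks_run += 1
        queue.extend(spawned)
    return total, tasks_run
```

Calling `count_queens(8)` returns the solution count 92 together with the number of tasks that were actually created. The board is copied only inside the `steal_requested()` branch, which is the point of the sketch: when no request arrives, the search pays nothing for task creation.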