Argobots: A Lightweight Low-Level Threading and Tasking Framework

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2018-03, Vol. 29 (3)
Main Authors: Seo, Sangmin; Amer, Abdelhalim; Balaji, Pavan; Bordage, Cyril; Bosilca, George; Brooks, Alex; Carns, Philip; Castello, Adrian; Genet, Damien; Herault, Thomas; Iwasaki, Shintaro; Jindal, Prateek; Kale, Laxmikant V.; Krishnamoorthy, Sriram; Lifflander, Jonathan; Lu, Huiwei; Meneses, Esteban; Snir, Marc; Sun, Yanhua; Taura, Kenjiro; Beckman, Pete
Format: Article
Language: English
Description
Summary: In the past few decades, a number of user-level threading and tasking models have been proposed in the literature to address the shortcomings of OS-level threads, primarily with respect to cost and flexibility. Current state-of-the-art user-level threading and tasking models, however, are either too specific to applications or architectures or are not as powerful or flexible. In this paper, we present Argobots, a lightweight, low-level threading and tasking framework that is designed as a portable and performant substrate for high-level programming models or runtime systems. Argobots offers a carefully designed execution model that balances generality of functionality with providing a rich set of controls to allow specialization by the user or high-level programming model. We describe the design, implementation, and optimization of Argobots and present integrations with three example high-level models: OpenMP, MPI, and a co-located I/O service. Evaluations show that (1) Argobots outperforms existing generic threading runtimes; (2) our OpenMP runtime offers more efficient interoperability capabilities than production OpenMP runtimes do; (3) when MPI interoperates with Argobots instead of Pthreads, it enjoys reduced synchronization costs and better latency-hiding capabilities; and (4) the I/O service with Argobots reduces interference with co-located applications, achieving performance competitive with that of the Pthreads version.
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2017.2766062