A deadlock‐free lock‐based synchronization for GPUs

Bibliographic Details
Published in: Concurrency and computation, 2019-04, Vol. 31 (7), p. n/a
Main Authors: Anand, Anshu S; Srivastava, Akash; Shyamasundar, R.K.
Format: Article
Language: English
Description
Summary: Graphics Processing Units (GPUs) have evolved from pure graphics applications toward general purpose applications, often referred to as GPGPU computing. However, the scope of GPGPU computing is still limited to data‐parallel applications that require little synchronization. As synchronization on GPUs is quite costly, synchronization requirements in GPUs are usually realized using existing synchronization primitives like atomic operations and barriers. These approaches either incur significant overhead or place certain restrictions on their usage, affecting the scalability and scope of such applications. The lack of adequate support for fine‐grained synchronization has restricted the realization of irregular algorithms on GPUs, wherein control flow and memory access patterns are data‐dependent and unpredictable. Recently, there has been interest in relating lock‐step semantics to interleaving semantics and in developing lock‐based synchronization mechanisms for GPUs to overcome these issues. GPUs follow the SIMD execution model, and hence, when adapted for general purpose computing, new and distinct deadlock scenarios arise. In this paper, we discuss various deadlock scenarios that can occur in GPUs and present a model of deadlocks in GPUs. We first illustrate such deadlock scenarios in GPU applications and then describe a novel lock‐based, deadlock‐free, fine‐grained synchronization mechanism for GPU architectures that overcomes deadlocks without significant overhead. We further establish the correctness of our methods and discuss the performance overheads.
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.4991