A deadlock‐free lock‐based synchronization for GPUs
Published in: | Concurrency and computation 2019-04, Vol.31 (7), p.n/a |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | Graphics Processing Units (GPUs) have evolved from pure graphics applications toward general purpose applications, often referred to as GPGPU computing. However, their scope is still limited to data‐parallel applications that require little synchronization. Because synchronization on GPUs is quite costly, synchronization requirements are usually realized using existing primitives such as atomic operations and barriers. These approaches either incur significant overhead or place restrictions on their usage, affecting the scalability and scope of such applications. The lack of adequate support for fine‐grained synchronization has restricted the realization of irregular algorithms on GPUs, wherein control flow and memory access patterns are data‐dependent and unpredictable. Recently, there has been interest in building a relationship between lock‐step semantics and interleaving semantics, and in developing lock‐based synchronization mechanisms for GPUs to overcome these issues. GPUs follow the SIMD execution model, and hence, when adapted for general purpose computing, new distinct deadlock scenarios arise. In this paper, we discuss various deadlock scenarios that can happen in GPUs and present a model of deadlocks in GPUs. We first illustrate such deadlock scenarios in GPU applications, and then describe a novel lock‐based, deadlock‐free, fine‐grained synchronization mechanism for GPU architectures that overcomes deadlocks without significant overhead. We further establish the correctness of our methods and discuss the performance overheads. |
---|---|
ISSN: | 1532-0626, 1532-0634 |
DOI: | 10.1002/cpe.4991 |
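
The summary notes that lock‐step (SIMD) execution gives rise to deadlock scenarios that do not occur under interleaving semantics. The sketch below is a toy Python model of one widely discussed scenario, a spin lock contended by lanes of the same warp. It is not the mechanism or the deadlock model from the article. The function names `run_naive` and `run_restructured`, the `max_steps` bound, and the scheduling rule (lanes that leave the spin loop are parked until every lane has left it) are all assumptions made for illustration.

```python
def run_naive(warp_size, max_steps=1000):
    """Spin loop first, critical section after the loop.

    Assumed model: a lane that leaves the loop is parked at the loop exit
    until every lane of the warp has left the loop.
    Returns (counter, finished).
    """
    lock_held = False
    counter = 0
    spinning = list(range(warp_size))  # lanes still inside the spin loop
    parked = []                        # lanes waiting at the loop exit
    for _ in range(max_steps):
        if not spinning:
            break
        still_spinning = []
        for lane in spinning:          # one lock-step compare-and-swap
            if not lock_held:
                lock_held = True       # this lane acquires the lock
                parked.append(lane)
            else:
                still_spinning.append(lane)
        spinning = still_spinning
    if spinning:
        # The lock holder is parked and can never release: deadlock.
        return counter, False
    for _ in parked:                   # reached only if no lane had to wait
        counter += 1                   # critical section
        lock_held = False              # release
    return counter, True


def run_restructured(warp_size, max_steps=1000):
    """Critical section and release placed inside the loop body.

    The winning lane finishes its critical section and releases the lock
    before the warp reconverges at the end of the iteration.
    Returns (counter, finished).
    """
    lock_held = False
    counter = 0
    pending = list(range(warp_size))   # lanes that still need the lock
    for _ in range(max_steps):
        if not pending:
            break
        winner = None
        for lane in pending:           # one lock-step compare-and-swap
            if not lock_held:
                lock_held = True
                winner = lane
        if winner is not None:
            counter += 1               # critical section
            lock_held = False          # release inside the same iteration
            pending.remove(winner)
    return counter, not pending


if __name__ == "__main__":
    print(run_naive(4))         # the holder is parked, the others spin
    print(run_restructured(4))  # one lane completes per iteration
```

In this model the naive ordering stalls as soon as two or more lanes contend for the lock, while the restructured loop lets one lane complete per iteration. Real hardware behavior depends on the architecture and its warp scheduler, so the model only illustrates why lock‐step execution needs separate treatment.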