Quantifying Data Locality in Dynamic Parallelism in GPUs

Dynamic parallelism (DP) is a new feature of emerging GPUs that allows new kernels to be generated and scheduled from the device side (GPU) without the host-side (CPU) intervention. To efficiently support DP, one of the major challenges is to saturate the GPU processing elements and provide them with t...

Bibliographic Details
Published in:	Performance evaluation review 2019-12, Vol.47 (1), p.25-26
Main Authors:	Tang, Xulong; Pattnaik, Ashutosh; Kayiran, Onur; Jog, Adwait; Kandemir, Mahmut Taylan; Das, Chita
Format:	Article
Language:	English
Subjects:	Computer systems organization; Computer systems organization / Architectures; Computer systems organization / Architectures / Parallel architectures; Computer systems organization / Architectures / Parallel architectures / Single instruction, multiple data; Computing methodologies; Computing methodologies / Computer graphics; Computing methodologies / Computer graphics / Graphics systems and interfaces; Computing methodologies / Computer graphics / Graphics systems and interfaces / Graphics processors; Computing methodologies / Parallel computing methodologies; Software and its engineering; Software and its engineering / Software notations and tools; Software and its engineering / Software notations and tools / Compilers; Software and its engineering / Software organization and properties; Software and its engineering / Software organization and properties / Contextual software domains; Software and its engineering / Software organization and properties / Contextual software domains / Operating systems; Software and its engineering / Software organization and properties / Contextual software domains / Operating systems / Process management; Software and its engineering / Software organization and properties / Contextual software domains / Operating systems / Process management / Scheduling

Description
Summary:	Dynamic parallelism (DP) is a new feature of emerging GPUs that allows new kernels to be generated and scheduled from the device side (GPU) without the host-side (CPU) intervention. To efficiently support DP, one of the major challenges is to saturate the GPU processing elements and provide them with the required data in a timely fashion. In this paper, we first conduct a limit study on the performance improvements that can be achieved by hardware schedulers that are provided with accurate data reuse information. We next propose LASER, a Locality-Aware SchedulER, where the hardware schedulers employ data reuse monitors to help make scheduling decisions to improve data locality at runtime. Experimental results on 16 benchmarks show that LASER, on an average, can improve performance by 11.3%.
ISSN:	0163-5999
DOI:	10.1145/3376930.3376947