Combining reduction with synchronization barrier on multi‐core processors

Bibliographic Details
Published in: Concurrency and computation 2023-01, Vol.35 (1), p.n/a
Main Authors: Mohamed El Maarouf, Aboul‐Karim; Giraud, Luc; Guermouche, Abdou; Guignon, Thomas
Format: Article
Language: English
Description
Summary: With the rise of multi‐core processors with a large number of cores, the need for shared memory reduction that performs efficiently on a large number of cores is more pressing. Efficient shared memory reduction on these multi‐core processors will help shared memory programs be more efficient. In this article, we propose a reduction method combined with a barrier method that uses SIMD read/write instructions to combine the barrier signal and the reduction value, minimizing memory/cache traffic between cores and thereby reducing barrier latency. We compare different barriers and reduction methods on three multi‐core processors and show that the proposed combining barrier/reduction methods are 4 and 3.5 times faster than the OpenMP 4.5 reductions of GCC 11.1 and Intel 21.2, respectively.
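
The idea of letting one write carry both the barrier signal and the reduction value can be illustrated with a small sketch. This is not the authors' implementation. It is a toy centralized barrier in Python, where rebinding a list slot to an `(epoch, value)` tuple stands in for the single SIMD store mentioned in the summary. The function name `combined_barrier_reduce`, the `rounds` parameter, and the per-round input `contribution * epoch` are all hypothetical, chosen only for illustration.

```python
import threading
import time


def combined_barrier_reduce(contributions, rounds=3):
    """Run one thread per contribution through `rounds` barrier+sum episodes.

    Returns, for each thread, the list of totals it observed after each barrier.
    """
    n = len(contributions)
    # One slot per thread holding (epoch, value). Rebinding a list element to a
    # new tuple is a single reference store in CPython, so a reader sees either
    # the old pair or the new pair, never a mix. ASSUMPTION: this stands in for
    # the single SIMD store of flag + value described in the summary.
    arrive = [(0, 0.0)] * n
    release = [(0, 0.0)]  # written only by thread 0
    seen = [[] for _ in range(n)]

    def worker(tid):
        for epoch in range(1, rounds + 1):
            value = contributions[tid] * epoch  # hypothetical per-round input
            arrive[tid] = (epoch, value)        # arrival signal and value in one store
            if tid == 0:
                total = 0.0
                for other in range(n):
                    while True:
                        e, v = arrive[other]    # one read yields flag and value
                        if e == epoch:
                            break
                        time.sleep(0)           # yield the GIL while waiting
                    total += v
                release[0] = (epoch, total)     # release signal carries the result
            while True:
                e, total_seen = release[0]
                if e == epoch:
                    break
                time.sleep(0)
            seen[tid].append(total_seen)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Calling `combined_barrier_reduce([1.0, 2.0, 3.0, 4.0], rounds=3)` runs four threads through three episodes. In each episode every thread leaves the barrier already holding the sum, so no separate reduction pass over shared memory is needed after the barrier. The sketch shows only the signaling pattern and says nothing about the performance figures reported in the article.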
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.7402