Fine‐grain task‐parallel algorithms for matrix factorizations and inversion on many‐threaded CPUs


Bibliographic Details
Published in: Concurrency and computation, 2023-12, Vol. 35 (27)
Main Authors: Catalán, Sandra; Herrero, José R.; Igual, Francisco D.; Quintana‐Ortí, Enrique S.; Rodríguez‐Sánchez, Rafael
Format: Article
Language: English
Description
Summary: We extend a two‐level task partitioning previously applied to the inversion of dense matrices via Gauss–Jordan elimination to the more challenging QR factorization as well as the initial orthogonal reduction to band form found in the singular value decomposition. Our new task‐parallel algorithms leverage the tasking mechanism currently available in OpenMP to exploit “nested” task parallelism, with a first outer level that operates on matrix panels and a second inner level that processes the matrix either by ‐panels or by tiles, in order to expose a large number of independent tasks. We present a detailed performance analysis, including execution traces, which shows that the two‐level refinement into fine‐grain tasks improves load balancing and delivers high performance on current general‐purpose many‐core processors (CPUs) from Intel and AMD.
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.6999
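
Illustration: the summary describes "nested" OpenMP tasking with an outer level over matrix panels and an inner level over tiles. The C sketch below shows only that nesting pattern, not the authors' code. As a stand-in for the real factorization kernels, it applies a rank-1 update A += x*y^T, so every task is independent. The QR factorization and band reduction in the article also have dependences between panel factorizations and trailing updates, which this sketch does not model. The names `update_tile` and `two_level_update` and the parameters `pb` (panel width) and `tb` (tile height) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in kernel: rank-1 update of one tile,
   A[i0:i1, j0:j1] += x * y^T. A is n x n, stored row-major. */
static void update_tile(double *A, size_t n, const double *x, const double *y,
                        size_t i0, size_t i1, size_t j0, size_t j1)
{
    for (size_t i = i0; i < i1; ++i)
        for (size_t j = j0; j < j1; ++j)
            A[i * n + j] += x[i] * y[j];
}

/* Outer level: one task per column panel of width pb.
   Inner level: each panel task spawns one task per tile of tb rows.
   Assumes pb > 0 and tb > 0. */
void two_level_update(double *A, size_t n, const double *x, const double *y,
                      size_t pb, size_t tb)
{
    #pragma omp parallel
    #pragma omp single
    {
        for (size_t j0 = 0; j0 < n; j0 += pb) {
            size_t j1 = (j0 + pb < n) ? j0 + pb : n;

            /* Outer task: owns one column panel. */
            #pragma omp task firstprivate(j0, j1)
            {
                for (size_t i0 = 0; i0 < n; i0 += tb) {
                    size_t i1 = (i0 + tb < n) ? i0 + tb : n;

                    /* Inner (nested) task: one tile of the panel. */
                    #pragma omp task firstprivate(i0, i1, j0, j1)
                    update_tile(A, n, x, y, i0, i1, j0, j1);
                }
                /* Wait for this panel's tiles before the panel task ends. */
                #pragma omp taskwait
            }
        }
    } /* Implicit barrier: all tasks have completed here. */
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC or Clang), the tile tasks can run on different threads. Without OpenMP the pragmas are ignored and the routine runs serially with the same result. Smaller `pb` and `tb` values expose more, finer tasks at the cost of more scheduling overhead.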