Scalable matrix decompositions with multiple cores on FPGAs

Bibliographic Details
Published in: Microprocessors and Microsystems, 2013-11, Vol. 37 (8), p. 887-898
Main Authors: Tai, Yi-Gang; Dan Lo, Chia-Tien; Psarris, Kleanthis
Format: Article
Language: English
Description
Summary: Hardware accelerators are becoming increasingly important in heterogeneous systems for many applications, including those that employ matrix decompositions. In recent years, a class of tiled matrix decomposition algorithms has been proposed for out-of-memory computations and multi-core architectures, including GPU-based heterogeneous systems. However, such scalable solutions for large matrices are rarely found on FPGAs. In this paper we perform scalable, high-performance QR and LU matrix decompositions on FPGAs by combining two techniques: the latest tiled decomposition algorithms from high-performance linear algebra for off-chip memory access, and loop mapping onto multiple processing cores for on-chip computation.
ISSN: 0141-9331, 1872-9436
DOI: 10.1016/j.micpro.2012.06.008
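
To illustrate what a tiled decomposition looks like, the following is a minimal Python sketch of a blocked LU factorization. It is not the authors' FPGA implementation: it omits pivoting entirely, runs sequentially, and the names `tiled_lu`, `lu_product`, and the tile-size parameter `b` are hypothetical, chosen only for this example. It only shows how the work splits into per-tile steps (diagonal factorization, panel solves, trailing update) that a multi-core design could, in principle, distribute.

```python
def tiled_lu(a, b):
    """Blocked LU without pivoting, in place (illustrative assumption, not the paper's code).

    a: square matrix as a list of lists of floats.
    b: tile size (hypothetical parameter).
    On return, the strict lower triangle holds L (unit diagonal implied)
    and the upper triangle holds U.
    """
    n = len(a)
    for k in range(0, n, b):
        ke = min(k + b, n)
        # Step 1: factor the diagonal tile with an unblocked LU.
        for p in range(k, ke):
            for i in range(p + 1, ke):
                a[i][p] /= a[p][p]
                for j in range(p + 1, ke):
                    a[i][j] -= a[i][p] * a[p][j]
        # Step 2: for each tile to the right, solve L_kk * U_kj = A_kj.
        for j0 in range(ke, n, b):
            for j in range(j0, min(j0 + b, n)):
                for p in range(k, ke):
                    for i in range(p + 1, ke):
                        a[i][j] -= a[i][p] * a[p][j]
        # Step 3: for each tile below, solve L_ik * U_kk = A_ik.
        for i0 in range(ke, n, b):
            for i in range(i0, min(i0 + b, n)):
                for p in range(k, ke):
                    a[i][p] /= a[p][p]
                    for j in range(p + 1, ke):
                        a[i][j] -= a[i][p] * a[p][j]
        # Step 4: update every trailing tile, A_ij -= L_ik * U_kj.
        for i0 in range(ke, n, b):
            for j0 in range(ke, n, b):
                for i in range(i0, min(i0 + b, n)):
                    for j in range(j0, min(j0 + b, n)):
                        s = 0.0
                        for p in range(k, ke):
                            s += a[i][p] * a[p][j]
                        a[i][j] -= s
    return a


def lu_product(lu):
    """Multiply the packed L (unit diagonal) and U factors back together."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for p in range(min(i, j) + 1):
                l_ip = 1.0 if p == i else lu[i][p]
                out[i][j] += l_ip * lu[p][j]
    return out
```

Within one iteration of the outer loop, the tiles handled in steps 2 and 3 are independent of each other, as are the tiles in step 4, which is what makes this style of algorithm a candidate for multiple processing cores. Because the sketch has no pivoting, it is only safe on matrices whose pivots stay nonzero, such as diagonally dominant ones.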