Loading…
Synthesis of space optimal systolic arrays for band matrix-vector multiplication
In this paper, we consider the implementation of a product c = A b , where A is N 1 × N 3 band matrix with bandwidth ω and b is a vector of size N 3 ×1, on bidirectional and unidirectional linear systolic arrays (BLSA and ULSA, respectively). We distinguish the cases when the matrix bandwidth ω is 1...
Saved in:
Published in: | The Journal of supercomputing 2009-09, Vol.49 (3), p.269-290 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | In this paper, we consider the implementation of a product
c
=
A
b
, where
A
is
N
1
×
N
3
band matrix with bandwidth
ω
and
b
is a vector of size
N
3
×1, on bidirectional and unidirectional linear systolic arrays (BLSA and ULSA, respectively). We distinguish the cases when the matrix bandwidth
ω
is 1≤
ω
≤
N
3
and
N
3
≤
ω
≤
N
1
+
N
3
−1. A modification of the systolic array synthesis procedure based on data dependencies and space-time transformations of data dependency graph is proposed. The modification enables obtaining both BLSA and ULSA with an optimal number of processing elements (PEs) regardless of the matrix bandwidth. The execution time of the synthesized arrays has been minimized. We derive explicit formulas for the synthesis of these arrays. The performances of the designed arrays are discussed and compared to the performances of the arrays obtained by the standard design procedure. |
---|---|
ISSN: | 0920-8542 1573-0484 |
DOI: | 10.1007/s11227-008-0241-x |