DiBA: n-Dimensional Bitslice Architecture for LSTM Implementation
| Field | Value |
| --- | --- |
| Main Authors | |
| Format | Conference Proceeding |
| Language | English |
| Subjects | |
| Online Access | Request full text |
| ISSN | 2473-2117 |
| DOI | 10.1109/DDECS50862.2020.9095614 |
Summary: A hardware architecture for implementing LSTM neural networks that can be scaled to the size of the problem is proposed here. Implementing an LSTM application requires iterating multiplications, additions, and activation functions over the stream of input data. To handle these iterations, bitslicing is used to cascade as many slices as the problem size demands for optimum performance. To avoid a long linear array of MAC slices, which would require large adders, the slices are arranged into an n-dimensional structure. Such a structure turns the adder units into slices of their own, which also operate concurrently with the rest of the hardware in pipeline fashion. This paper presents this bitslice architecture, which can serve as the fabric for a programmable, general-purpose LSTM implementation. The paper also describes an FPGA implementation that uses on-chip FPGA RAM for the memory the LSTM requires. The work is compared with other designs that do not consider multidimensional structures, as well as with one that considers multidimensional cascading; in both cases, the proposed structure is faster and uses smaller adder structures.
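For context, the multiply-add-activate pattern the summary refers to is the standard LSTM cell computation. The equations below are the textbook formulation, shown here for reference rather than excerpted from the paper:

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i), \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o), \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c), \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \\
h_t &= o_t \odot \tanh(c_t).
\end{aligned}
$$

Every matrix-vector product here is a stream of multiply-accumulate operations, which is what the cascaded MAC slices implement, while $\sigma$ and $\tanh$ account for the activation-function units.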
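As a minimal sketch of why the n-dimensional arrangement shrinks the adders, the following Python model reduces MAC partial products through a hierarchy of small adder stages instead of one wide accumulation. This is an illustrative software model, not the paper's hardware design; names such as `mac_slices`, `adder_stage`, and the `group` parameter are invented for the example.

```python
# Software model of the reduction pattern described in the summary:
# partial products from MAC slices are summed by log-depth stages of
# small adders rather than by one long linear accumulation chain.

def mac_slices(weights, inputs):
    """Each product models the output of one MAC slice in a given cycle."""
    return [w * x for w, x in zip(weights, inputs)]

def adder_stage(values, group):
    """Model one adder-slice stage: sum small groups of neighboring values.

    In hardware, each stage would be a slice of its own, operating
    concurrently with the MAC slices on the next wave of input data.
    """
    return [sum(values[i:i + group]) for i in range(0, len(values), group)]

def tree_reduce(values, group=2):
    """Reduce partial products through successive adder stages.

    A linear array of N MAC slices would need one wide N-input
    accumulation; arranging the slices hierarchically replaces it with
    ceil(log_group(N)) stages of small group-input adders.
    """
    while len(values) > 1:
        values = adder_stage(values, group)
    return values[0]

if __name__ == "__main__":
    w = [3, -1, 4, 1, -5, 9, 2, 6]
    x = [2, 7, 1, 8, 2, 8, 1, 8]
    partials = mac_slices(w, x)
    # The tree reduction must agree with a plain dot product.
    assert tree_reduce(partials, group=2) == sum(partials)
    print(tree_reduce(partials, group=2))
```

With `group = g` and `N` slices, each adder handles only `g` inputs and the reduction finishes in about log base `g` of `N` pipelined stages, which matches the summary's claim of smaller adders operating concurrently with the rest of the hardware.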