
A Framework for the Analysis of Throughput-Constraints of SNNs on Neuromorphic Hardware


Bibliographic Details
Main Authors: Balaji, Adarsha, Das, Anup
Format: Conference Proceeding
Language: English
Description
Summary: Spiking neural networks (SNNs) are efficient computation models for inferring spatio-temporal pattern recognition applications on neuromorphic hardware. Neuromorphic hardware is typically designed as a set of interconnected crossbars, with each crossbar containing a group of fully connected neurons. To ensure application performance, such as accuracy, as well as system performance, such as throughput and resource utilization, SNNs need to be efficiently mapped onto the neuromorphic hardware. To address this, we propose a design flow to partition and map SNN-based applications onto neuromorphic hardware, with the aim of enhancing both application and system performance. The design flow operates in two steps: (1) a two-step clustering technique that partitions trained SNNs into clusters of neurons and synapses so as to minimize inter-cluster spike communication, and (2) mapping and scheduling of the clusters onto crossbar-based architectures, modeled using Synchronous Data-flow Graphs (SDFGs). The SDFG model incorporates hardware constraints, such as the I/O bandwidth of the crossbars and the synaptic memory, while analyzing the throughput of the modeled system. Our design flow integrates CARLsim, a GPU-accelerated application-level SNN simulator, with SDF3, a tool to map SDFGs onto hardware. We evaluate the design flow using synthetic and realistic SNN-based applications. We show that, for throughput-constrained applications, we achieve reductions of 21.74% in memory usage and 15.03% in utilization of the time-multiplexed interconnect compared to a state-of-the-art approach.
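The partitioning step summarized above hinges on grouping neurons so that most spike traffic stays inside a single crossbar. The sketch below illustrates that idea with a simple greedy heuristic; it assumes the trained SNN is available as an undirected networkx graph whose edges carry a "spikes" attribute, and it assumes a crossbar capacity of 128 neurons. The graph format, the capacity, and the heuristic are illustrative assumptions, not the paper's actual two-step clustering algorithm or its SDFG-based throughput analysis.

```python
# Minimal sketch of crossbar-aware SNN partitioning, under the assumptions
# stated above (not the authors' algorithm).
import networkx as nx

CROSSBAR_SIZE = 128  # assumed number of neurons one crossbar can accommodate


def partition_snn(spike_graph: nx.Graph, crossbar_size: int = CROSSBAR_SIZE):
    """Greedily assign each neuron to the cluster it already exchanges the most
    spikes with, opening a new cluster when no existing one has room."""
    clusters = []      # list of sets of neuron ids, one set per crossbar
    assignment = {}    # neuron id -> cluster index

    # Place heavily communicating neurons first so they anchor the clusters.
    order = sorted(spike_graph.nodes,
                   key=lambda n: spike_graph.degree(n, weight="spikes"),
                   reverse=True)

    for neuron in order:
        best_idx, best_gain = None, -1
        for idx, cluster in enumerate(clusters):
            if len(cluster) >= crossbar_size:
                continue  # this crossbar is already full
            # Spikes that stay local if the neuron joins this cluster.
            gain = sum(spike_graph[neuron][nbr].get("spikes", 0)
                       for nbr in spike_graph.neighbors(neuron)
                       if nbr in cluster)
            if gain > best_gain:
                best_idx, best_gain = idx, gain
        if best_idx is None:
            clusters.append({neuron})              # open a new crossbar
            assignment[neuron] = len(clusters) - 1
        else:
            clusters[best_idx].add(neuron)
            assignment[neuron] = best_idx

    # Inter-cluster spikes: the quantity the partitioning step tries to
    # minimize, since these spikes must cross the shared interconnect.
    cut = sum(data.get("spikes", 0)
              for u, v, data in spike_graph.edges(data=True)
              if assignment[u] != assignment[v])
    return clusters, cut
```

In a flow like the one described in the summary, the resulting clusters would then be treated as actors of an SDFG whose channel rates reflect the inter-cluster spike counts, so a cut value such as the one returned above is what the throughput analysis would consume.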
ISSN: 2159-3477
DOI: 10.1109/ISVLSI.2019.00043