CorTile: A Scalable Neuromorphic Processing Core for Cortical Simulation With Hybrid-Mode Router and TCAM
Published in: | IEEE transactions on circuits and systems. I, Regular papers, 2024-12, Vol.71 (12), p.5432-5444 |
---|---|
Main Authors: | , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Simulating large-scale Spiking Neural Networks (SNNs) for cortical models on neuromorphic processors demands a significant increase in communication traffic and memory capacity when the sparsity of connections is not exploited. This paper therefore proposes CorTile, a scalable neuromorphic processing core designed for cortical simulation. We propose a hybrid-mode router that supports a Remote Unicast and Local Broadcast (RULB) routing method, leveraging the high local connectivity and low distal connectivity observed in cortical models. Compared to conventional routing methods, this approach reduces average router load by 36.7%, peak load by 40.7%, average link traffic by 51.2%, and peak traffic by 41.7%. Additionally, the proposed Ternary Content Addressable Memory (TCAM)-based Sparse Connection Memory (TSCM) architecture achieves an 87.1% reduction in area and a 62.7% reduction in power consumption. Together, these approaches decrease communication traffic and replace the quadratic growth in memory requirements with linear growth, making the design scalable. CorTile is simulated in a UMC 40-nm CMOS process, occupies an area of 5.15 mm², and supports a maximum of 8k neurons and 64M synapses. Evaluated on a typical macaque cortex model, it consumes 8.25 mW, with the router operating at 200 MHz and the other modules at 100 MHz. The design achieves an average router load of 12.33 Mpackets/s and a peak link traffic of 21.16 MB/s. Because the processing core can be tiled into many-core processors, it paves the way for chiplet and multi-chip integration toward a brain-scale neuromorphic computing system. |
---|---|
ISSN: | 1549-8328, 1558-0806 |
DOI: | 10.1109/TCSI.2024.3431036 |