A Fully Digital SRAM-Based Four-Layer In-Memory Computing Unit Achieving Multiplication Operations and Results Store

Bibliographic Details
Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2023-06, Vol. 31 (6), p. 1-13
Main Authors: Lin, Zhiting; Zhang, Shaoying; Jin, Qian; Xia, Jianping; Liu, Yunwei; Yu, Kefeng; Zheng, Jian; Xu, Xiaoming; Fan, Xing; Li, Ke; Tong, Zhongzhen; Wu, Xiulong; Lu, Wenjuan; Peng, Chunyu; Zhao, Qiang
Format: Article
Language: English
Description
Summary: The separation of memory and arithmetic logic unit (ALU) in the von Neumann computing architecture hinders the development of big data and high-performance computing. In-memory computing (IMC), as a new computation method, significantly reduces the latency and power consumption of data processing. In this study, we propose a fully digital static random access memory (SRAM)-based IMC architecture, which has the following advantages: 1) it simplifies multiplication to multicycle addition operations, reuses logic cells, and reduces hardware overhead; 2) by adding a pair of nMOS transistors to achieve internal write-back, it improves computational efficiency and allows the final result of the multiplication to be stored locally, eliminating the need to read the computational result immediately; and 3) the scheme can be easily expanded to multiplication operations with different bit widths, which provides good scalability. A 4-kb SRAM-IMC macro chip is manufactured using the SMIC 55-nm technology to realize 4-bit multiplication, with an energy efficiency of 51.4 TOPS/W (at 0.9 V) and a throughput of 234.3 GOPS/mm². The proposed multiplication-accumulation architecture is applied to a neural network, which achieves 98.7% accuracy on the Modified National Institute of Standards and Technology (MNIST) dataset.
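The summary's first point, reducing multiplication to multicycle addition with the partial result kept in place, can be illustrated in software. The following is a minimal sketch, assuming unsigned operands and a conventional shift-and-add schedule with one conditional addition per multiplier bit; the function names and the cycle ordering are hypothetical and are not taken from the article, which describes a circuit rather than a program.

```python
def multicycle_multiply(a: int, b: int, width: int = 4) -> int:
    """Multiply two unsigned `width`-bit integers using only shifts and adds.

    Hypothetical software analogue: `acc` stands in for the locally stored
    partial result that is updated once per cycle.
    """
    limit = 1 << width
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError(f"operands must fit in {width} unsigned bits")

    acc = 0
    for cycle in range(width):
        # Examine one multiplier bit per cycle, least significant first.
        if (b >> cycle) & 1:
            # Add the multiplicand, shifted to the weight of this bit.
            acc += a << cycle
    return acc


def multiply_accumulate(inputs, weights, width: int = 4) -> int:
    """Sum of element-wise products, built on the multicycle multiplier."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(multicycle_multiply(x, w, width) for x, w in zip(inputs, weights))


if __name__ == "__main__":
    print(multicycle_multiply(13, 11))               # 4-bit operands
    print(multicycle_multiply(200, 150, width=8))    # same loop, wider operands
    print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

Changing only the `width` argument extends the same loop to wider operands, which loosely mirrors the scalability claim in the summary; the number of addition cycles grows linearly with the bit width in this sketch.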
ISSN: 1063-8210, 1557-9999
DOI: 10.1109/TVLSI.2023.3266651