Efficient Processing of MLPerf Mobile Workloads Using Digital Compute-In-Memory Macros
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, April 2024, Vol. 43, No. 4, p. 1-1
Format: Article
Language: English
Summary: Compute-In-Memory (CIM) has recently emerged as a promising design paradigm to accelerate Deep Neural Network (DNN) processing. Continuously improving energy and area efficiency at the macro level has been reported through many test chips over the past few years. However, those macro-oriented studies have not investigated accelerator-level considerations, such as memory accesses and the processing of entire DNN workloads, in depth. In this paper, we aim to fill this gap, starting with the characteristics of our latest CIM macro fabricated in a cutting-edge 4 nm FinFET CMOS technology. Using an accelerator simulator developed in-house, we then study three key items that determine the efficiency of our CIM macro in the accelerator context while running the MLPerf Mobile suite: 1) dataflow optimization, 2) optimal selection of CIM macro dimensions to further improve macro utilization, and 3) optimal combination of multiple CIM macros. Although there is typically a stark contrast between macro-level peak and accelerator-level average throughput and energy efficiency, the aforementioned optimizations improve macro utilization by 3.04× and reduce the Energy-Delay Product (EDP) to 0.34× of the original macro on MLPerf Mobile inference workloads. While we exploit a digital CIM macro in this study, the findings and proposed methods remain valid for other types of CIM (such as analog CIM and analog-digital-hybrid CIM) as well.
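The abstract quantifies its gains via macro utilization and the Energy-Delay Product (EDP = energy × delay, lower is better). As a minimal sketch of what those metrics mean, the Python snippet below computes utilization for a naive weight-stationary tiling of a layer's weight matrix onto a rows × cols macro array. The layer and macro dimensions are hypothetical and the tiling model is an assumption, not the paper's simulator; only the reported 3.04× utilization and 0.34× EDP factors come from the source.

```python
import math

def macro_utilization(k_in, n_out, rows, cols):
    """Fraction of CIM macro cells doing useful work when a k_in x n_out
    weight matrix is tiled onto a (rows x cols) macro array.
    Assumes a simple weight-stationary tiling; partial tiles leave
    unused rows/columns, which drags utilization below 1.0."""
    tiles = math.ceil(k_in / rows) * math.ceil(n_out / cols)
    return (k_in * n_out) / (tiles * rows * cols)

def energy_delay_product(energy_j, delay_s):
    """EDP = energy x delay; lower is better."""
    return energy_j * delay_s

# Hypothetical layer: 96 input channels, 200 output channels.
# A square 64x64 macro wastes capacity on the ragged tile edges ...
print(macro_utilization(96, 200, 64, 64))   # ~0.59
# ... while a rectangular 32x100 macro fits this layer exactly,
# illustrating why macro-dimension selection matters.
print(macro_utilization(96, 200, 32, 100))  # 1.0

# The paper reports its combined optimizations improving macro
# utilization by 3.04x and reducing EDP to 0.34x of the baseline.
base_edp = energy_delay_product(1.0, 1.0)   # normalized baseline
print(0.34 * base_edp)  # optimized EDP relative to the original macro
```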
ISSN: 0278-0070, 1937-4151
DOI: 10.1109/TCAD.2023.3333290