Loading…
Static Scheduling of Weight Programming for DNN Acceleration with Resource Constrained PIM
Most existing architectural studies on ReRAM-based processing-in-memory (PIM) DNN accelerators assume that all weights of the DNN can be mapped to the crossbar at once. However, these studies are over-idealized. ReRAM crossbar resources for calculation are limited because of technological limitation...
Saved in:
Published in: | ACM transactions on embedded computing systems 2024-11, Vol.23 (6), p.1-22, Article 89 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Most existing architectural studies on ReRAM-based processing-in-memory (PIM) DNN accelerators assume that all weights of the DNN can be mapped to the crossbar at once. However, these studies are over-idealized. ReRAM crossbar resources for calculation are limited because of technological limitations, so multiple weight mapping procedures are required during the inference process. In this article, we propose a static scheduling framework which generates the mapping between DNN weights and ReRAM cells with minimum runtime weight programming cost. We first build a ReRAM crossbar programming latency model by simultaneously considering the DNN weight patterns, ReRAM programming operations, and PIM architecture characteristics. Then, the model is used in the searching process to obtain an optimized weight-to-OU mapping table with minimum online programming latency. Finally, an OU scheduler is used to coordinate the activation sequences of OUs in the crossbars to perform the inference computation correctly. Evaluation results show the proposed framework significantly reduces the weight programming overhead and the overall inference latency for various DNN models with different input datasets. |
---|---|
ISSN: | 1539-9087 1558-3465 |
DOI: | 10.1145/3615657 |