Near-Data Processing in Memory Expander for DNN Acceleration on GPUs
Published in: | IEEE Computer Architecture Letters, 2021-07, Vol. 20 (2), pp. 171-174 |
---|---|
Main Authors: | , , , , , , , , |
Format: | Article |
Language: | English |
Summary: | We propose a near-data processing (NDP) architecture that exploits a memory expander with a byte-addressable, memory-semantic interconnect to accelerate memory-bound operations in deep neural networks (DNNs). Our architecture can execute NDP operations on the memory traffic from the GPU on the fly by employing bump-in-the-wire NDP logic between the off-chip link and the memory controller. In addition, the memory-bound operations executed on the NDP unit can be effectively overlapped with compute-intensive operations executed on the GPU, even if the two operations have a dependency. Furthermore, the compiler can perform the NDP offloading automatically, without any code modification by deep learning practitioners. Our approach achieves a 51% speedup for training VGG-16 with batch normalization. |
---|---|
ISSN: | 1556-6056, 1556-6064 |
DOI: | 10.1109/LCA.2021.3126450 |
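
The summary's two quantitative ideas, overlapping memory-bound work with compute-bound work and the reported 51% speedup, can be illustrated with a small back-of-the-envelope sketch. This is not the authors' model. It assumes that "speedup" means old time divided by new time, minus one, and that perfectly overlapped operations take as long as the slower of the two. The function names and the 6.0 s / 4.0 s timings are hypothetical and chosen only for illustration.

```python
def time_fraction(speedup_percent: float) -> float:
    """New time as a fraction of old time, assuming
    speedup = old_time / new_time - 1 (an assumed definition)."""
    return 1.0 / (1.0 + speedup_percent / 100.0)


def serial_time(compute_s: float, memory_s: float) -> float:
    """Both kinds of operation run back to back on one device."""
    return compute_s + memory_s


def overlapped_time(compute_s: float, memory_s: float) -> float:
    """Idealized perfect overlap: the slower operation hides the other."""
    return max(compute_s, memory_s)


def speedup_percent(old_s: float, new_s: float) -> float:
    """Speedup in percent under the same assumed definition."""
    return (old_s / new_s - 1.0) * 100.0


if __name__ == "__main__":
    # Under the assumed definition, a 51% speedup leaves about 66%
    # of the original training time.
    print(f"{time_fraction(51.0):.3f}")  # prints 0.662

    # Hypothetical timings, not taken from the paper.
    compute_s, memory_s = 6.0, 4.0
    old = serial_time(compute_s, memory_s)
    new = overlapped_time(compute_s, memory_s)
    print(f"{speedup_percent(old, new):.1f}%")
```

Under these assumptions, a 51% speedup corresponds to roughly a one-third reduction in training time. The idealized overlap model is an upper bound: real gains depend on how much of the memory-bound work can actually be hidden.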