Hybrid Convolution Architecture for Energy-Efficient Deep Neural Network Processing

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems I: Regular Papers, 2021-05, Vol. 68 (5), p. 2017-2029
Main Authors: Kim, Suchang, Jo, Jihyuck, Park, In-Cheol
Format: Article
Language:English
Description
Summary: This paper presents a convolution process and its hardware architecture for energy-efficient deep neural network (DNN) processing. A DNN generally consists of a number of convolutional layers, and in a shallow layer the number of input features involved in the convolution is larger than the number of kernels. As the layers deepen, however, the number of input features decreases while the number of kernels increases. Previous convolution architectures developed to enhance energy efficiency have tried to reduce memory accesses by increasing the reuse of data once fetched from memory. However, redundant memory accesses are still required because the change in these data counts has not been considered. We propose a hybrid convolution process that selects either a kernel-stay or a feature-stay process by taking the relative numbers of features and kernels into account, together with a forwarding technique that further reduces the memory accesses needed to store and load partial sums. The proposed convolution process is effective in maximizing data reuse, leading to an energy-efficient hybrid convolution architecture. Compared to state-of-the-art architectures, the proposed architecture enhances energy efficiency by up to 2.38 times in a 65-nm CMOS process.
ISSN:1549-8328
1558-0806
DOI:10.1109/TCSI.2021.3059882