3.53-TOPS/W EEAIP: An Energy-Efficient Artificial Intelligence Hardware Architecture for Edge AI Applications
Published in: | IEEE transactions on consumer electronics 2024-02, Vol.70 (1), p.4333-4344 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Summary: | Artificial intelligence in the Internet of Things (AIoT) is a promising technology for consumer electronics. Battery life and package size are essential constraints for AI applications on edge devices. Thus, an efficient hardware architecture is important to support deep neural network (DNN) AI algorithms. The critical concerns are the high memory bandwidth and multichannel computation requirements of DNN processing. Conventional AI processors exploit complex memory pads, dedicated processing element (PE) buffers, and mass shift registers to support data reuse for memory bandwidth reduction. However, such architectures incur significant area overhead and power consumption. This paper proposes a novel channel-interleaved memory (CIM) footprint and dual-level memory pad (DLMP) control to enhance memory bandwidth utilization and simplify the memory pad circuit. Interleaved channel data are read from the memory bus with a single access and stored in a ping-pong buffer for reuse. Dynamic power is reduced by replacing the shift register PE mechanism with simplified mux selection. A joint stationary data reuse (JSDR) approach is adopted to process interleaved channel data efficiently. Finally, a hybrid memory buffer (HMB) reduces on-chip memory use through dynamic memory allocation. Experimental results demonstrate that the proposed architecture achieves a state-of-the-art area efficiency of 207.4 GOPS/mm² while maintaining a high power efficiency of 3.53 TOPS/W. |
ISSN: | 0098-3063 1558-4127 |
DOI: | 10.1109/TCE.2023.3323644 |