A 3D Hybrid Optical-Electrical NoC Using Novel Mapping Strategy Based DCNN Dataflow Acceleration

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2024-07, Vol. 35 (7), p. 1139-1154
Main Authors: Zhang, Bowen; Gu, Huaxi; Zhang, Grace Li; Yang, Yintang; Ma, Ziteng; Schlichtmann, Ulf
Format: Article
Language: English
Description
Summary: The large number of multiply-accumulate operations and memory accesses required by deep convolutional neural networks (DCNNs) leads to high latency and energy consumption (EC), which hinder their wider application. Dataflow-based acceleration schemes reduce memory accesses by exploiting reusable data in DCNNs. Row Stationary (RS) dataflow is one of the more advanced dataflows. However, when RS dataflow accelerates convolutional layers, the mapping from logical processing element (LPE) sets to physical processing element (PE) sets is relatively inflexible, and PE utilization is low. This article proposes a novel genetic-algorithm-based mapping strategy (GAMS) that aims to optimize EC. GAMS is designed to address the energy inefficiencies that arise when mapping RS dataflow. A 3D hybrid optical-electrical Network-on-Chip (3DHOENoC) is also proposed to further improve the communication efficiency, energy efficiency, and processing speed of DCNNs. Simulation and evaluation results show that GAMS achieves better mapping flexibility, higher PE utilization, and a 15.9% improvement in execution speed on average. In addition, the execution time (ET) of DCNN processing can be further improved by adopting the 3DHOENoC architecture, which offers better communication parallelism.
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2024.3394747