Fixed Point Implementation of Tiny-Yolo-v2 using OpenCL on FPGA

Bibliographic Details
Published in:	International journal of advanced computer science & applications 2018, Vol.9 (10)
Main Authors:	Wai, Yap June; bin, Zulkalnain; Irwan, Sani; Kim, Lim
Format:	Article
Language:	English
Subjects:	Algorithms; Artificial neural networks; Chips (memory devices); Electronic devices; Field programmable gate arrays; Image classification; Object recognition; Software development tools

Description
Summary:	Deep Convolutional Neural Network (CNN) algorithms have recently gained popularity in many applications such as image classification, video analytics and object detection. Being compute-intensive and memory-expensive, CNN-based algorithms are hard to implement on embedded devices. Although recent studies have explored hardware implementations of CNN-based object classification models such as AlexNet and VGG, implementations of CNN-based object detection models on Field Programmable Gate Arrays (FPGAs) remain rare. Consequently, this study proposes a fixed-point (16-bit) implementation of a CNN-based object detection model, Tiny-Yolo-v2, on the Cyclone V PCIe Development Kit FPGA board using a High-Level Synthesis (HLS) tool, OpenCL. Considering FPGA resource constraints in terms of computational resources, memory bandwidth, and on-chip memory, a data pre-processing approach is proposed to merge batch normalization into the convolution layer. To the best of our knowledge, this is the first implementation of the Tiny-Yolo-v2 object detection algorithm on FPGA using the Intel FPGA Software Development Kit (SDK) for OpenCL. Finally, the proposed implementation achieves a peak performance of 21 GOPS at a working frequency of 100 MHz.
ISSN:	2158-107X 2156-5570
DOI:	10.14569/IJACSA.2018.091062
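
Note on the summary: the two techniques it names, merging batch normalization into the convolution layer and storing values as 16-bit fixed point, can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes the standard inference-time folding identity (scale = gamma / sqrt(var + eps), applied per output channel), and the function names `fold_batch_norm`, `to_fixed16` and `from_fixed16` are hypothetical. The record does not state how the 16 bits are split between integer and fractional parts, so the 8 fractional bits used here are an assumption.

```python
import math


def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batch-norm parameters into convolution weights and biases.

    weights: one flat list of kernel weights per output channel.
    The other arguments hold one value per output channel.
    """
    folded_w, folded_b = [], []
    for c, w_c in enumerate(weights):
        # Standard folding identity; assumed, not taken from the paper.
        scale = gamma[c] / math.sqrt(var[c] + eps)
        folded_w.append([w * scale for w in w_c])
        folded_b.append((bias[c] - mean[c]) * scale + beta[c])
    return folded_w, folded_b


def to_fixed16(value, frac_bits=8):
    """Quantize a float to a signed 16-bit fixed-point integer, saturating at the limits."""
    q = int(round(value * (1 << frac_bits)))
    return max(-32768, min(32767, q))


def from_fixed16(q, frac_bits=8):
    """Convert a signed 16-bit fixed-point integer back to a float."""
    return q / (1 << frac_bits)
```

After folding, the layer needs only the multiply-accumulate of the convolution itself, so the batch-norm parameters no longer have to be stored or read at inference time. Values outside the representable range saturate rather than wrap around.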