
Optimizing Weight Value Quantization for CNN Inference


Bibliographic Details
Main Authors: Nogami, Wakana; Ikegami, Tsutomu; O'uchi, Shin-ichi; Takano, Ryousei; Kudoh, Tomohiro
Format: Conference Proceeding
Language: English
Description
Summary: CNN models are growing in size and complexity, and as a result they require more computational and memory resources to be used effectively. Lower bit-width numerical representations, such as binary, ternary, or a few bits, have been studied extensively as a way to reduce the required resources. However, the representation capability of such extremely low bit widths is not always sufficient, and the accuracy obtained for some CNN models and data is low. Some prior studies use moderately low bit widths with well-known numerical representations such as fixed-point or logarithmic representation. It is not apparent, however, whether those representations are optimal for maintaining high accuracy. In this paper, we investigated numerical quantization from the ground up and introduced a novel "Variable Bin-size Quantization (VBQ)" representation, in which quantization bin boundaries are optimized to obtain maximum accuracy for each CNN model. A genetic algorithm was employed to optimize the bin boundaries of VBQ. Additionally, since the bit width needed to obtain sufficient accuracy cannot be determined in advance, we used the parameters obtained by training with a higher-precision representation (FP32) and applied quantization in inference only. This reduced the large computational cost required for training. While tuning the VBQ boundaries with the genetic algorithm, we discovered that the optimal distribution of bins can be approximated by an equation with two parameters. We then used simulated annealing to find the optimal parameters of the equation for AlexNet and VGG16. As a result, AlexNet and VGG16 with our 4-bit quantization achieved top-5 accuracies of 74.8% and 86.3%, respectively, comparable to the 76.3% and 88.1% obtained with FP32. Thus, VBQ combined with the approximate equation and the simulated annealing scheme can achieve similar levels of accuracy with fewer resources and reduced computational cost compared to other current approaches.
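
The idea of quantizing with non-uniform bin boundaries can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sample weights, the boundary values, and the choice of the bin mean as each bin's reconstruction value are all assumptions made for illustration. The summary does not give the two-parameter equation or the optimization details, so none of that is reproduced here. With 3 boundaries the example has 4 bins (2 bits); a 4-bit scheme would use 15 boundaries for 16 bins.

```python
from bisect import bisect_right


def quantize(weights, boundaries):
    """Return the bin index of each weight.

    `boundaries` must be sorted ascending; k boundaries define k + 1 bins.
    """
    return [bisect_right(boundaries, w) for w in weights]


def bin_means(weights, indices, n_bins):
    """Representative value per bin: the mean of the weights that fall in it.

    Assumption for illustration only; empty bins get 0.0.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for w, i in zip(weights, indices):
        sums[i] += w
        counts[i] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def dequantize(indices, levels):
    """Replace each bin index with that bin's representative value."""
    return [levels[i] for i in indices]


# Hypothetical FP32 weights and hand-picked, unevenly spaced boundaries.
weights = [-0.9, -0.2, -0.05, 0.0, 0.03, 0.4, 1.2]
boundaries = [-0.5, -0.1, 0.1]  # narrower bins near zero

indices = quantize(weights, boundaries)
levels = bin_means(weights, indices, len(boundaries) + 1)
approx = dequantize(indices, levels)

# Squared error between original and quantized weights; an optimizer would
# adjust `boundaries` to improve a score (the paper optimizes model accuracy).
error = sum((w - a) ** 2 for w, a in zip(weights, approx))
```

In this sketch only the bin indices and the small table of levels would need to be stored for inference. A search procedure such as a genetic algorithm or simulated annealing would repeatedly propose new `boundaries` and keep those that score better.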
ISSN: 2161-4407
DOI: 10.1109/IJCNN.2019.8852331