Quantization Method Integrated with Progressive Quantization and Distillation Learning

Bibliographic Details
Published in: Procedia Computer Science, 2023, Vol. 228, pp. 281-290
Main Authors: Huang, Heyan, Pan, Bocheng, Wang, Liwei, Jiang, Cheng
Format: Article
Language: English
Description
Summary: This paper proposes a quantization method that integrates progressive quantization with distillation learning, aiming to address the difficulty traditional quantization methods have in maintaining model accuracy while reducing model size. The method converts the weights from floating-point numbers to lower-precision integers through progressive quantization, reducing the storage space the model requires. At the same time, distillation learning is used to integrate the progressively quantized model with the original floating-point model, improving the accuracy of the quantized model. The experimental results show that the proposed method maintains high model accuracy while reducing model size and performs better than traditional quantization methods. The method has broad application prospects in model compression and can be used in scenarios such as edge devices and cloud servers.
ISSN: 1877-0509
DOI: 10.1016/j.procs.2023.11.032
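
The summary describes two ingredients: lowering weight precision in stages, and using the floating-point model as a teacher for the quantized one. The following is a minimal NumPy sketch of that general idea, not the authors' implementation. The linear toy model, the 8/6/4-bit schedule, the symmetric uniform quantizer, the temperature, the learning rate, and the straight-through gradient update are all assumptions made for illustration; the record does not specify any of them.

```python
import numpy as np

def quantize_uniform(w, num_bits):
    """Symmetric uniform quantization: round to signed integers, then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1           # e.g. 7 for 4 bits
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()
    scale = max_abs / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale

def softmax(z, temperature=1.0):
    z = z / temperature
    z = z - z.max(axis=-1, keepdims=True)    # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Batch-mean KL(teacher || student) on temperature-softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)))

# Toy data and a floating-point "teacher" (hypothetical single linear layer).
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 16))
w_teacher = rng.normal(size=(16, 4))
teacher_logits = x @ w_teacher

T, lr = 2.0, 0.1                             # assumed temperature and step size
w_latent = w_teacher.copy()                  # float weights kept behind the quantizer
history = []
for bits in (8, 6, 4):                       # assumed progressive bit-width schedule
    for _ in range(50):
        w_q = quantize_uniform(w_latent, bits)
        p = softmax(teacher_logits, T)
        q = softmax(x @ w_q, T)
        # Gradient of the distillation loss w.r.t. the quantized weights,
        # applied to the latent weights (straight-through estimator).
        grad = x.T @ (q - p) / (T * len(x))
        w_latent -= lr * grad
    w_q = quantize_uniform(w_latent, bits)
    history.append((bits, distillation_loss(x @ w_q, teacher_logits, T)))

for bits, loss in history:
    print(f"{bits}-bit student, distillation loss: {loss:.6f}")
```

Each stage starts from the weights fine-tuned at the previous, higher precision, which is the sense in which the quantization is progressive here; a real system would apply the same loop per layer of a deep network and typically combine the distillation term with a task loss.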