MedQ: Lossless ultra-low-bit neural network quantization for medical image segmentation
Published in: Medical Image Analysis, 2021-10, Vol. 73, Article 102200
Main Authors:
Format: Article
Language: English
Subjects:
Summary:
• A neural network quantization method is proposed for medical image segmentation.
• Lossless performance is achieved with up to 14× theoretical speedup and 15× compression.
• A novel adaptive quantizer and a radical residual connection scheme are developed.
• Extensive experiments on LiTS and BRATS2020 demonstrate our method’s effectiveness.
Implementing deep convolutional neural networks (CNNs) with Boolean arithmetic is ideal for eliminating the notoriously high computational expense of deep learning models. However, although lossless model compression via weight-only quantization has been achieved in previous works, it remains an open problem how to reduce the computation precision of CNNs without losing performance, especially for medical image segmentation tasks, where data dimensionality is high and annotation is scarce. This paper presents a novel CNN quantization framework that can squeeze a deep model (both parameters and activations) to extremely low bitwidths, e.g., 1∼2 bits, while maintaining its high performance. In the new method, we first design a strong baseline quantizer with an optimizable quantization range. Then, to relieve the back-propagation difficulty caused by the discontinuous quantization function, we design a radical residual connection scheme that allows gradients to flow freely through every quantized layer. Moreover, a tanh-based derivative function is used to further boost gradient flow, and a distributional loss is employed to regularize the model output. Extensive experiments and ablation studies are conducted on two well-established public 3D segmentation datasets, i.e., BRATS2020 and LiTS. Experimental results show that our framework not only significantly outperforms state-of-the-art quantization approaches, but also achieves lossless performance on both datasets with ternary (2-bit) quantization.
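The abstract outlines three training-side ingredients: a quantizer with an optimizable quantization range, radical residual connections around every quantized layer, and a tanh-based derivative used in place of the zero-almost-everywhere gradient of the rounding step. The snippet below is a minimal PyTorch sketch of only the first and third ideas; the class names, the learnable range alpha, the sharpness factor k, and the exact surrogate form are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (illustrative, not the paper's code): a uniform quantizer with
# a learnable clipping range and a tanh-shaped surrogate gradient for round().
import torch
import torch.nn as nn


class _RoundTanhSTE(torch.autograd.Function):
    """Round in the forward pass; back-propagate a tanh-based derivative."""

    @staticmethod
    def forward(ctx, x, k):
        ctx.save_for_backward(x)
        ctx.k = k
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        k = ctx.k
        # Derivative of the smooth surrogate tanh(k * (x - round(x))) with
        # round(x) treated as a constant: k * (1 - tanh(k * r)^2).
        r = x - torch.round(x)
        surrogate = k * (1.0 - torch.tanh(k * r) ** 2)
        return grad_out * surrogate, None  # no gradient for the constant k


class LearnableRangeQuantizer(nn.Module):
    """Uniform quantizer whose clipping range alpha is learned by SGD."""

    def __init__(self, n_bits: int = 2, k: float = 2.0):
        super().__init__()
        self.levels = 2 ** n_bits - 1                  # integer steps in [0, levels]
        self.k = k                                     # sharpness of the surrogate
        self.alpha = nn.Parameter(torch.tensor(1.0))   # optimizable quantization range

    def forward(self, x):
        alpha = torch.abs(self.alpha) + 1e-6
        # Clip to [-alpha, alpha], map to [0, levels], round, then map back.
        x_c = torch.maximum(torch.minimum(x, alpha), -alpha)
        x_n = (x_c + alpha) / (2 * alpha) * self.levels
        x_q = _RoundTanhSTE.apply(x_n, self.k)
        return x_q / self.levels * (2 * alpha) - alpha


if __name__ == "__main__":
    quant = LearnableRangeQuantizer(n_bits=2)
    y = quant(torch.randn(4, 8, requires_grad=True))
    y.sum().backward()  # gradients reach both the input and alpha
```

The radical residual connection scheme and the distributional output loss mentioned in the abstract are orthogonal to this quantizer and are not sketched here.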
ISSN: 1361-8415, 1361-8423
DOI: 10.1016/j.media.2021.102200