Loading…

ARPruning: An automatic channel pruning based on attention map ranking

Structured pruning is a representative model compression technology for convolutional neural networks (CNNs), aiming to prune some less important filters or channels of CNNs. Most recent structured pruning methods have established some criteria to measure the importance of filters, which are mainly...

Full description

Saved in:
Bibliographic Details
Published in:Neural networks 2024-06, Vol.174, p.106220-106220, Article 106220
Main Authors: Yuan, Tongtong, Li, Zulin, Liu, Bo, Tang, Yinan, Liu, Yujia
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Structured pruning is a representative model compression technology for convolutional neural networks (CNNs), aiming to prune some less important filters or channels of CNNs. Most recent structured pruning methods have established some criteria to measure the importance of filters, which are mainly based on the magnitude of weights or other parameters in CNNs. However, these judgment criteria lack explainability, and it is insufficient to simply rely on the numerical values of the network parameters to assess the relationship between the channel and the model performance. Moreover, directly utilizing these pruning criteria for global pruning may lead to suboptimal solutions, therefore, it is necessary to complement search algorithms to determine the pruning ratio for each layer. To address these issues, we propose ARPruning (Attention-map-based Ranking Pruning), which reconstructs a new pruning criterion as the importance of the intra-layer channels and further develops a new local neighborhood search algorithm for determining the optimal inter-layer pruning ratio. To measure the relationship between the channel to be pruned and the model performance, we construct an intra-layer channel importance criterion by considering the attention map for each layer. Then, we propose an automatic pruning strategy searching method that can search for the optimal solution effectively and efficiently. By integrating the well-designed pruning criteria and search strategy, our ARPruning can not only maintain a high compression rate but also achieve outstanding accuracy. In our work, it is also experimentally concluded that compared with state-of-the-art pruning methods, our ARPruning method is capable of achieving better compression results. The code can be obtained at https://github.com/dozingLee/ARPruning.
ISSN:0893-6080
1879-2782
DOI:10.1016/j.neunet.2024.106220