
1xN Pattern for Pruning Convolutional Neural Networks

Though network pruning receives popularity in reducing the complexity of convolutional neural networks (CNNs), it remains an open issue to concurrently maintain model accuracy as well as achieve significant speedups on general CPUs. In this paper, we propose a novel 1×N pruning pattern to break this limitation. In particular, consecutive N output kernels with the same input channel index are grouped into one block, which serves as a basic pruning granularity of our pruning pattern. Our 1×N pattern prunes these blocks considered unimportant. We also provide a workflow of filter rearrangement that first rearranges the weight matrix in the output channel dimension to derive more influential blocks for accuracy improvements and then applies similar rearrangement to the next-layer weights in the input channel dimension to ensure correct convolutional operations. Moreover, the output computation after our 1×N pruning can be realized via a parallelized block-wise vectorized operation, leading to significant speedups on general CPUs. The efficacy of our pruning pattern is proved with experiments on ILSVRC-2012. For example, given the pruning rate of 50% and N=4, our pattern obtains about 3.0% improvements over filter pruning in the top-1 accuracy of MobileNet-V2. Meanwhile, it obtains 56.04 ms inference savings on Cortex-A7 CPU over weight pruning. Our project is made available at https://github.com/lmbxmu/1xN .

Saved in:
Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-04, Vol. 45 (4), p. 3999-4008
Main Authors: Lin, Mingbao, Zhang, Yuxin, Li, Yuchao, Chen, Bohong, Chao, Fei, Wang, Mengdi, Li, Shen, Tian, Yonghong, Ji, Rongrong
Format: Article
Language:English
Subjects:
description Though network pruning receives popularity in reducing the complexity of convolutional neural networks (CNNs), it remains an open issue to concurrently maintain model accuracy as well as achieve significant speedups on general CPUs. In this paper, we propose a novel 1×N pruning pattern to break this limitation. In particular, consecutive N output kernels with the same input channel index are grouped into one block, which serves as a basic pruning granularity of our pruning pattern. Our 1×N pattern prunes these blocks considered unimportant. We also provide a workflow of filter rearrangement that first rearranges the weight matrix in the output channel dimension to derive more influential blocks for accuracy improvements and then applies similar rearrangement to the next-layer weights in the input channel dimension to ensure correct convolutional operations. Moreover, the output computation after our 1×N pruning can be realized via a parallelized block-wise vectorized operation, leading to significant speedups on general CPUs. The efficacy of our pruning pattern is proved with experiments on ILSVRC-2012. For example, given the pruning rate of 50% and N=4, our pattern obtains about 3.0% improvements over filter pruning in the top-1 accuracy of MobileNet-V2. Meanwhile, it obtains 56.04ms inference savings on Cortex-A7 CPU over weight pruning. Our project is made available at https://github.com/lmbxmu/1xN .
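As an illustration of the pattern the abstract describes, here is a minimal NumPy sketch of 1×N block pruning on the weight matrix of a 1×1 convolution. It is not the authors' implementation (see the linked GitHub repository for that); the function name `one_by_n_prune` and the L1-norm block-importance score are assumptions made for the sketch, and the filter-rearrangement step is omitted.

```python
import numpy as np

def one_by_n_prune(weight, n=4, prune_rate=0.5):
    # Sketch of the 1xN pattern on a (C_out, C_in) weight matrix (a 1x1 conv):
    # n consecutive output kernels sharing one input channel index form a block.
    c_out, c_in = weight.shape
    assert c_out % n == 0, "C_out must be divisible by the block height n"
    # blocks[b, j, r] = weight[b*n + r, j]: block b, input channel j, row r.
    blocks = weight.reshape(c_out // n, n, c_in).transpose(0, 2, 1)
    scores = np.abs(blocks).sum(axis=-1)    # L1 importance of each 1xN block
    k = int(scores.size * prune_rate)       # number of blocks to drop
    thresh = np.sort(scores, axis=None)[k]  # score of the first surviving block
    mask = (scores >= thresh).astype(weight.dtype)
    # Expand the per-block mask back to the (C_out, C_in) layout and apply it.
    mask_full = np.repeat(mask[:, :, None], n, axis=2).transpose(0, 2, 1)
    return weight * mask_full.reshape(c_out, c_in)
```

Because the surviving zeros come in contiguous 1×N strips of output channels, the remaining computation can be organized as the block-wise vectorized operation the abstract mentions, which is where the CPU speedups come from.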
doi_str_mv 10.1109/TPAMI.2022.3195774
identifier ISSN: 0162-8828
issn 0162-8828
1939-3539
2160-9292
source IEEE Electronic Library (IEL) Journals
subjects Accuracy
Artificial neural networks
CNNs
Convolution
Convolutional neural networks
CPUs acceleration
Filtering algorithms
Indexes
Kernel
Model accuracy
Network pruning
Neural networks
Pruning
pruning pattern
Shape
Training
Workflow