
1xN Pattern for Pruning Convolutional Neural Networks

Though network pruning receives popularity in reducing the complexity of convolutional neural networks (CNNs), it remains an open issue to concurrently maintain model accuracy as well as achieve significant speedups on general CPUs. In this paper, we propose a novel 1×N pruning pattern to break this limitation. In particular, consecutive N output kernels with the same input channel index are grouped into one block, which serves as a basic pruning granularity of our pruning pattern. Our 1×N pattern prunes these blocks considered unimportant. We also provide a workflow of filter rearrangement that first rearranges the weight matrix in the output channel dimension to derive more influential blocks for accuracy improvements and then applies similar rearrangement to the next-layer weights in the input channel dimension to ensure correct convolutional operations. Moreover, the output computation after our 1×N pruning can be realized via a parallelized block-wise vectorized operation, leading to significant speedups on general CPUs. The efficacy of our pruning pattern is proved with experiments on ILSVRC-2012. For example, given the pruning rate of 50% and N=4, our pattern obtains about 3.0% improvements over filter pruning in the top-1 accuracy of MobileNet-V2. Meanwhile, it obtains 56.04 ms inference savings on Cortex-A7 CPU over weight pruning. Our project is made available at https://github.com/lmbxmu/1xN .

Saved in:
Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-04, Vol. 45 (4), p. 3999-4008
Main Authors: Lin, Mingbao, Zhang, Yuxin, Li, Yuchao, Chen, Bohong, Chao, Fei, Wang, Mengdi, Li, Shen, Tian, Yonghong, Ji, Rongrong
Format: Article
Language:English
Subjects:
description Though network pruning receives popularity in reducing the complexity of convolutional neural networks (CNNs), it remains an open issue to concurrently maintain model accuracy as well as achieve significant speedups on general CPUs. In this paper, we propose a novel 1×N pruning pattern to break this limitation. In particular, consecutive N output kernels with the same input channel index are grouped into one block, which serves as a basic pruning granularity of our pruning pattern. Our 1×N pattern prunes these blocks considered unimportant. We also provide a workflow of filter rearrangement that first rearranges the weight matrix in the output channel dimension to derive more influential blocks for accuracy improvements and then applies similar rearrangement to the next-layer weights in the input channel dimension to ensure correct convolutional operations. Moreover, the output computation after our 1×N pruning can be realized via a parallelized block-wise vectorized operation, leading to significant speedups on general CPUs. The efficacy of our pruning pattern is proved with experiments on ILSVRC-2012. For example, given the pruning rate of 50% and N=4, our pattern obtains about 3.0% improvements over filter pruning in the top-1 accuracy of MobileNet-V2. Meanwhile, it obtains 56.04ms inference savings on Cortex-A7 CPU over weight pruning. Our project is made available at https://github.com/lmbxmu/1xN .
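As an illustration of the pattern the abstract describes, here is a minimal NumPy sketch of 1×N block pruning on the weight matrix of a 1×1 convolution. It is not the authors' implementation (see the linked GitHub repository for that); the function name `one_by_n_prune` and the L1-norm block-importance score are assumptions made for the sketch, and the filter-rearrangement step is omitted.

```python
import numpy as np

def one_by_n_prune(weight, n=4, prune_rate=0.5):
    # Sketch of the 1xN pattern on a (C_out, C_in) weight matrix (a 1x1 conv):
    # n consecutive output kernels sharing one input channel index form a block.
    c_out, c_in = weight.shape
    assert c_out % n == 0, "C_out must be divisible by the block height n"
    # blocks[b, j, r] = weight[b*n + r, j]: block b, input channel j, row r.
    blocks = weight.reshape(c_out // n, n, c_in).transpose(0, 2, 1)
    scores = np.abs(blocks).sum(axis=-1)    # L1 importance of each 1xN block
    k = int(scores.size * prune_rate)       # number of blocks to drop
    thresh = np.sort(scores, axis=None)[k]  # score of the first surviving block
    mask = (scores >= thresh).astype(weight.dtype)
    # Expand the per-block mask back to the (C_out, C_in) layout and apply it.
    mask_full = np.repeat(mask[:, :, None], n, axis=2).transpose(0, 2, 1)
    return weight * mask_full.reshape(c_out, c_in)
```

Because the surviving zeros come in contiguous 1×N strips of output channels, the remaining computation can be organized as the block-wise vectorized operation the abstract mentions, which is where the CPU speedups come from.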
doi_str_mv 10.1109/TPAMI.2022.3195774
identifier ISSN: 0162-8828
issn 0162-8828
1939-3539
2160-9292
source IEEE Electronic Library (IEL) Journals
subjects Accuracy
Artificial neural networks
CNNs
Convolution
Convolutional neural networks
CPUs acceleration
Filtering algorithms
Indexes
Kernel
Model accuracy
Network pruning
Neural networks
Pruning
pruning pattern
Shape
Training
Workflow