Learn & drop: fast learning of CNNs based on layer dropping

This paper proposes a new method to improve the training efficiency of deep convolutional neural networks. During training, the method evaluates scores that measure how much each layer's parameters change and whether the layer will continue learning. Based on these scores, the network is scaled down so that the number of parameters to be learned is reduced, yielding a speed-up in training. Unlike state-of-the-art methods that compress the network for the inference phase or limit the number of operations performed in the back-propagation phase, the proposed method focuses on reducing the number of operations performed by the network in the forward propagation during training. The proposed training strategy has been validated on two widely used architecture families, VGG and ResNet. Experiments on MNIST, CIFAR-10 and Imagenette show that, with the proposed method, the training time of the models is more than halved without significantly impacting accuracy. The FLOPs reduction in the forward propagation during training ranges from 17.83% for VGG-11 to 83.74% for ResNet-152. The impact on accuracy depends on the depth of the model: the decrease is between 0.26% and 2.38% for VGGs and between 0.4% and 3.2% for ResNets. These results demonstrate the effectiveness of the proposed technique in speeding up the learning of CNNs. The technique will be especially useful in applications where fine-tuning or online training of convolutional models is required, for instance because data arrive sequentially.
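
The abstract describes the mechanism only at a high level: per-layer scores track how much each layer's parameters are still changing, and layers judged to have stopped learning are dropped from the forward pass. The sketch below illustrates that general idea in PyTorch; it is not the authors' algorithm. The relative-change score, the fixed threshold, the per-epoch schedule, and the replacement of converged, shape-preserving blocks with nn.Identity() are all assumptions made for illustration.

```python
import torch.nn as nn

def parameter_change_score(block: nn.Module, snapshot: dict) -> float:
    """Mean relative change of the block's parameters since the last snapshot.
    Illustrative score only; the paper defines its own scoring rule."""
    changes = []
    for name, p in block.named_parameters():
        prev = snapshot[name]
        changes.append((p.detach() - prev).abs().mean().item()
                       / (prev.abs().mean().item() + 1e-12))
    return sum(changes) / max(len(changes), 1)

def drop_converged_blocks(blocks: nn.ModuleList, snapshots: list, threshold: float = 1e-3):
    """Hypothetical per-epoch step: blocks whose parameters have essentially stopped
    changing are replaced by nn.Identity(), removing their cost from both the forward
    and backward passes for the rest of training. Assumes every block maps its input
    to an output of the same shape (e.g. the residual blocks inside a ResNet stage).
    In practice one might freeze or fold the block instead of simply bypassing it."""
    for i, block in enumerate(blocks):
        if isinstance(block, nn.Identity):
            continue  # already dropped earlier
        if parameter_change_score(block, snapshots[i]) < threshold:
            blocks[i] = nn.Identity()  # drop: no parameters left to learn here
        else:
            # refresh the snapshot so the score measures change per epoch
            snapshots[i] = {n: p.detach().clone() for n, p in block.named_parameters()}
```

A hypothetical usage would keep the candidate blocks in an nn.ModuleList, take an initial snapshot of each block's parameters before training, and call drop_converged_blocks at the end of every epoch; how often to score the layers and where to set the threshold are exactly the kind of choices governed by the scores the paper evaluates during training.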

Bibliographic Details
Published in: Neural Computing & Applications, 2024-06, Vol. 36 (18), p. 10839-10851
Main Authors: Cruciata, Giorgio; Cruciata, Luca; Lo Presti, Liliana; van Gemert, Jan; La Cascia, Marco
Format: Article
Language:English
Subjects: Accuracy; Artificial Intelligence; Artificial neural networks; Back propagation networks; Computational Biology/Bioinformatics; Computational Science and Engineering; Computer Science; Data Mining and Knowledge Discovery; Image Processing and Computer Vision; Learning; Original Article; Parameters; Probability and Statistics in Computer Science; Propagation
DOI: 10.1007/s00521-024-09592-3
ISSN: 0941-0643
EISSN: 1433-3058
Publisher: Springer London
Source: Springer Nature