Adaptive Low-Rank Regularization with Damping Sequences to Restrict Lazy Weights in Deep Networks
Overfitting is one of the critical problems in deep neural networks. Many regularization schemes try to prevent overfitting blindly and, in doing so, slow the convergence of training algorithms. Adaptive regularization schemes address overfitting more intelligently: they usually do not act on the entire set of network weights. This paper detects the subset of weighting layers that causes overfitting, recognizing it by matrix and tensor condition numbers. An adaptive regularization scheme called Adaptive Low-Rank (ALR) is proposed that drives a subset of the weighting layers toward their Low-Rank Factorization (LRF) by minimizing a new Tikhonov-based loss function. ALR also encourages lazy weights to contribute to the regularization as the epochs grow, using a damping sequence to increase the likelihood of layer selection in the later generations. Thus, before the training accuracy falls, ALR shrinks the lazy weights and regularizes the network substantially. The experimental results show that ALR regularizes deep networks well, with high training speed and low resource usage.
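The abstract above describes three ingredients of ALR: matrix and tensor condition numbers to flag overfitting-prone layers, a Tikhonov-based penalty that pulls selected layers toward their low-rank factorization, and a damping sequence that raises the chance of selecting lazy layers as epochs grow. The NumPy sketch below is only one possible reading of that description, not the paper's implementation; the condition-number threshold, the fixed target rank, the `1 - damping**epoch` schedule, and every function name here are assumptions made for illustration.

```python
import numpy as np

def condition_number(weight, eps=1e-12):
    """Condition number of a weight array. Higher-order tensors
    (e.g. conv kernels) are flattened to a 2-D matrix first; this
    matricization is an assumption, not necessarily the tensor
    condition number used in the paper."""
    mat = weight.reshape(weight.shape[0], -1)
    s = np.linalg.svd(mat, compute_uv=False)
    return s.max() / max(s.min(), eps)

def low_rank_penalty(weight, rank):
    """Tikhonov-style term ||W - W_r||_F^2, where W_r is the rank-`rank`
    truncated SVD of W. Minimizing it pulls the layer toward a
    low-rank factorization (LRF)."""
    mat = weight.reshape(weight.shape[0], -1)
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    w_r = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    return float(np.sum((mat - w_r) ** 2))

def selection_probability(epoch, damping=0.9):
    """Damped layer-selection likelihood: small early on, approaching 1
    in later epochs so that 'lazy' layers are eventually drawn into the
    regularization. The schedule 1 - damping**epoch is a stand-in for
    the paper's damping sequence."""
    return 1.0 - damping ** epoch

def alr_like_penalty(weights, epoch, rank=4, cond_threshold=1e3,
                     lam=1e-3, rng=None):
    """Sum an ALR-like penalty over layers that look ill-conditioned
    and are (probabilistically) selected at this epoch."""
    if rng is None:
        rng = np.random.default_rng(0)
    p = selection_probability(epoch)
    total = 0.0
    for w in weights:
        if condition_number(w) > cond_threshold and rng.random() < p:
            total += low_rank_penalty(w, rank)
    return lam * total

# Toy usage: a well-conditioned layer and a nearly rank-1 (ill-conditioned) one.
rng = np.random.default_rng(0)
w_good = rng.standard_normal((64, 128))
w_lazy = np.outer(rng.standard_normal(64), rng.standard_normal(128)) \
         + 1e-4 * rng.standard_normal((64, 128))
print(alr_like_penalty([w_good, w_lazy], epoch=30))
```

In an actual training loop, a penalty of this kind would presumably be added to the task loss for the selected layers at each epoch; the paper's precise loss and damping sequence should be taken from the article itself.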
Published in: | arXiv.org 2021-06 |
---|---|
Main Authors: | Mohammad Mahdi Bejani; Ghatee, Mehdi |
Format: | Article |
Language: | English |
Subjects: | Adaptive algorithms; Artificial neural networks; Convergence; Damping; Regularization; Tensors; Weighting |
Online Access: | Get full text |
container_title | arXiv.org |
---|---|
creator | Mohammad Mahdi Bejani; Ghatee, Mehdi |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2021-06 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2543468877 |
source | ProQuest - Publicly Available Content Database |
subjects | Adaptive algorithms; Artificial neural networks; Convergence; Damping; Regularization; Tensors; Weighting |
title | Adaptive Low-Rank Regularization with Damping Sequences to Restrict Lazy Weights in Deep Networks |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T11%3A39%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Adaptive%20Low-Rank%20Regularization%20with%20Damping%20Sequences%20to%20Restrict%20Lazy%20Weights%20in%20Deep%20Networks&rft.jtitle=arXiv.org&rft.au=Mohammad%20Mahdi%20Bejani&rft.date=2021-06-17&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2543468877%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_25434688773%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2543468877&rft_id=info:pmid/&rfr_iscdi=true |