A General Descent Aggregation Framework for Gradient-Based Bi-Level Optimization
In recent years, a variety of gradient-based methods have been developed to solve Bi-Level Optimization (BLO) problems in machine learning and computer vision. However, the theoretical correctness and practical effectiveness of these existing approaches always rely on restrictive conditions (e.g., the Lower-Level Singleton, LLS), which can hardly be satisfied in real-world applications. Moreover, previous literature only proves theoretical results for specific iteration strategies, and thus lacks a general recipe for uniformly analyzing the convergence behaviors of different gradient-based BLO methods. In this work, we formulate BLOs from an optimistic bi-level viewpoint and establish a new gradient-based algorithmic framework, named Bi-level Descent Aggregation (BDA), to partially address the above issues. Specifically, BDA provides a modularized structure that hierarchically aggregates both the upper- and lower-level subproblems to generate the bi-level iterative dynamics. Theoretically, we establish a general convergence analysis template and derive a new proof recipe to investigate the essential theoretical properties of gradient-based BLO methods. Furthermore, this work systematically explores the convergence behavior of BDA in different optimization scenarios, i.e., considering the various solution qualities (global/local/stationary solutions) returned from solving the approximation subproblems. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed algorithm for hyper-parameter optimization and meta-learning tasks. Source code is available at https://github.com/vis-opt-group/BDA.
Published in: IEEE transactions on pattern analysis and machine intelligence, 2023-01, Vol. 45 (1), pp. 38-57
Main Authors: Liu, Risheng; Mu, Pan; Yuan, Xiaoming; Zeng, Shangzhi; Zhang, Jin
Format: Article
Language: English
Subjects: Agglomeration; Algorithms; Approximation algorithms; Bi-level optimization; Cognitive tasks; Computer vision; Convergence; descent aggregation; Dynamical systems; gradient-based method; Heuristic algorithms; hyper-parameter optimization; Iterative methods; Machine learning; meta-learning; Optimization; Source code; Task analysis
creator | Liu, Risheng; Mu, Pan; Yuan, Xiaoming; Zeng, Shangzhi; Zhang, Jin
description | In recent years, a variety of gradient-based methods have been developed to solve Bi-Level Optimization (BLO) problems in machine learning and computer vision areas. However, the theoretical correctness and practical effectiveness of these existing approaches always rely on some restrictive conditions (e.g., Lower-Level Singleton, LLS), which could hardly be satisfied in real-world applications. Moreover, previous literature only proves theoretical results based on their specific iteration strategies, thus lack a general recipe to uniformly analyze the convergence behaviors of different gradient-based BLOs. In this work, we formulate BLOs from an optimistic bi-level viewpoint and establish a new gradient-based algorithmic framework, named Bi-level Descent Aggregation (BDA), to partially address the above issues. Specifically, BDA provides a modularized structure to hierarchically aggregate both the upper- and lower-level subproblems to generate our bi-level iterative dynamics. Theoretically, we establish a general convergence analysis template and derive a new proof recipe to investigate the essential theoretical properties of gradient-based BLO methods. Furthermore, this work systematically explores the convergence behavior of BDA in different optimization scenarios, i.e., considering various solution qualities (i.e., global/local/stationary solution) returned from solving approximation subproblems. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed algorithm for hyper-parameter optimization and meta-learning tasks. Source code is available at https://github.com/vis-opt-group/BDA . |
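The descent-aggregation idea in the abstract — combining the upper- and lower-level gradient directions at every lower-level iteration, rather than descending on the lower-level objective alone — can be sketched in a few lines. Everything below is an illustrative assumption, not the paper's actual algorithm: the quadratic objectives, the step size, and the decaying 1/(k+1) aggregation weight are all stand-ins chosen so the behavior is easy to check by hand.

```python
# Toy sketch of Bi-level Descent Aggregation (BDA), based only on the
# high-level description in this record's abstract. Objectives, step size,
# and the aggregation schedule alpha_k are illustrative assumptions.

def grad_y_F(x, y):
    """d/dy of a toy upper-level objective F(x, y) = 0.5 * (y - 1)^2."""
    return y - 1.0

def grad_y_f(x, y):
    """d/dy of a toy lower-level objective f(x, y) = 0.5 * (y - x)^2."""
    return y - x

def bda_inner(x, y0, num_steps=200, step=0.5):
    """At each inner step, descend along an aggregation of the upper- and
    lower-level gradients; the upper-level weight alpha decays so the
    iterates settle toward the lower-level solution."""
    y = y0
    for k in range(1, num_steps + 1):
        alpha = 1.0 / (k + 1)  # decaying aggregation weight (assumed schedule)
        direction = alpha * grad_y_F(x, y) + (1.0 - alpha) * grad_y_f(x, y)
        y -= step * direction
    return y

y = bda_inner(x=3.0, y0=0.0)
print(y)  # approaches the lower-level minimizer y = x = 3 as alpha decays
```

Early iterations, where alpha is large, bias the trajectory toward the upper-level objective's minimizer; as alpha decays the lower-level gradient dominates, which is the mechanism that lets BDA dispense with the Lower-Level Singleton assumption discussed in the abstract.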
doi_str_mv | 10.1109/TPAMI.2022.3140249 |
format | article |
fulltext | fulltext |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2023-01, Vol.45 (1), p.38-57 |
issn | 0162-8828 1939-3539 2160-9292 |
language | eng |
recordid | cdi_pubmed_primary_34982677 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Agglomeration; Algorithms; Approximation algorithms; Bi-level optimization; Cognitive tasks; Computer vision; Convergence; descent aggregation; Dynamical systems; gradient-based method; Heuristic algorithms; hyper-parameter optimization; Iterative methods; Machine learning; meta-learning; Optimization; Source code; Task analysis
title | A General Descent Aggregation Framework for Gradient-Based Bi-Level Optimization |