
A General Descent Aggregation Framework for Gradient-Based Bi-Level Optimization

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-01, Vol. 45 (1), p. 38-57
Main Authors: Liu, Risheng, Mu, Pan, Yuan, Xiaoming, Zeng, Shangzhi, Zhang, Jin
Format: Article
Language:English
Abstract: In recent years, a variety of gradient-based methods have been developed to solve Bi-Level Optimization (BLO) problems in machine learning and computer vision. However, the theoretical correctness and practical effectiveness of these existing approaches always rely on restrictive conditions (e.g., the Lower-Level Singleton condition, LLS), which can hardly be satisfied in real-world applications. Moreover, previous literature only proves theoretical results for specific iteration strategies, and thus lacks a general recipe for uniformly analyzing the convergence behavior of different gradient-based BLO methods. In this work, we formulate BLO from an optimistic bi-level viewpoint and establish a new gradient-based algorithmic framework, named Bi-level Descent Aggregation (BDA), to partially address the above issues. Specifically, BDA provides a modularized structure that hierarchically aggregates both the upper- and lower-level subproblems to generate our bi-level iterative dynamics. Theoretically, we establish a general convergence analysis template and derive a new proof recipe to investigate the essential theoretical properties of gradient-based BLO methods. Furthermore, this work systematically explores the convergence behavior of BDA in different optimization scenarios, i.e., considering the various solution qualities (global/local/stationary solutions) returned from solving the approximation subproblems. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed algorithm for hyper-parameter optimization and meta-learning tasks. Source code is available at https://github.com/vis-opt-group/BDA .
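The aggregation idea in the abstract can be sketched as a toy iteration: at each lower-level step, descend along a convex combination of the lower-level gradient and the upper-level gradient, with the upper-level weight decaying over iterations. The quadratic objectives, step size, and weight schedule below are illustrative assumptions for a one-dimensional example, not the paper's exact scheme; see the linked repository for the authors' implementation.

```python
# Toy sketch of descent aggregation: the lower-level variable y is updated
# along a convex combination of the upper-level gradient (of F) and the
# lower-level gradient (of f), rather than the lower-level gradient alone.
# All objectives and schedules here are hypothetical illustrations.

def grad_F_y(x, y):
    # upper-level objective F(x, y) = 0.5 * (x - y)**2, gradient in y
    return -(x - y)

def grad_f_y(x, y):
    # lower-level objective f(x, y) = 0.5 * (y - 2*x)**2, gradient in y
    return y - 2.0 * x

def aggregated_inner_loop(x, y0, steps=50, alpha=0.1):
    """Lower-level iterations that aggregate both levels' gradients."""
    y = y0
    for k in range(steps):
        mu = 1.0 / (k + 2)  # upper-level weight, decaying toward 0
        d = mu * grad_F_y(x, y) + (1.0 - mu) * grad_f_y(x, y)
        y = y - alpha * d
    return y

x = 1.0
y = aggregated_inner_loop(x, y0=0.0)
print(round(y, 3))  # approaches the lower-level minimizer y* = 2x = 2
```

As the upper-level weight `mu` shrinks, the iterate drifts toward the lower-level solution while early steps are biased by the upper-level objective; this mirrors, in miniature, how aggregation avoids committing to a single lower-level minimizer when the LLS condition fails.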
DOI: 10.1109/TPAMI.2022.3140249
ISSN: 0162-8828
EISSN: 1939-3539
EISSN: 2160-9292
Source: IEEE Electronic Library (IEL) Journals
Subjects: Agglomeration
Algorithms
Approximation algorithms
Bi-level optimization
Cognitive tasks
Computer vision
Convergence
descent aggregation
Dynamical systems
gradient-based method
Heuristic algorithms
hyper-parameter optimization
Iterative methods
Machine learning
meta-learning
Optimization
Source code
Task analysis