A General Descent Aggregation Framework for Gradient-Based Bi-Level Optimization
In recent years, a variety of gradient-based methods have been developed to solve Bi-Level Optimization (BLO) problems in machine learning and computer vision. However, the theoretical correctness and practical effectiveness of these existing approaches always rely on restrictive conditions (e.g., the Lower-Level Singleton, LLS), which can hardly be satisfied in real-world applications. Moreover, previous literature only proves theoretical results for specific iteration strategies, and thus lacks a general recipe for uniformly analyzing the convergence behaviors of different gradient-based BLO methods. In this work, we formulate BLOs from an optimistic bi-level viewpoint and establish a new gradient-based algorithmic framework, named Bi-level Descent Aggregation (BDA), to partially address the above issues. Specifically, BDA provides a modularized structure that hierarchically aggregates both the upper- and lower-level subproblems to generate the bi-level iterative dynamics. Theoretically, we establish a general convergence analysis template and derive a new proof recipe to investigate the essential theoretical properties of gradient-based BLO methods. Furthermore, this work systematically explores the convergence behavior of BDA in different optimization scenarios, i.e., considering the various solution qualities (global/local/stationary solutions) returned from solving the approximation subproblems. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed algorithm for hyper-parameter optimization and meta-learning tasks. Source code is available at https://github.com/vis-opt-group/BDA.
Published in: IEEE transactions on pattern analysis and machine intelligence, 2023-01, Vol. 45 (1), pp. 38-57
Main Authors: Liu, Risheng; Mu, Pan; Yuan, Xiaoming; Zeng, Shangzhi; Zhang, Jin
Format: Article
Language: English
Subjects: Agglomeration; Algorithms; Approximation algorithms; Bi-level optimization; Cognitive tasks; Computer vision; Convergence; descent aggregation; Dynamical systems; gradient-based method; Heuristic algorithms; hyper-parameter optimization; Iterative methods; Machine learning; meta-learning; Optimization; Source code; Task analysis
creator | Liu, Risheng; Mu, Pan; Yuan, Xiaoming; Zeng, Shangzhi; Zhang, Jin
description | In recent years, a variety of gradient-based methods have been developed to solve Bi-Level Optimization (BLO) problems in machine learning and computer vision areas. However, the theoretical correctness and practical effectiveness of these existing approaches always rely on some restrictive conditions (e.g., Lower-Level Singleton, LLS), which could hardly be satisfied in real-world applications. Moreover, previous literature only proves theoretical results based on their specific iteration strategies, thus lack a general recipe to uniformly analyze the convergence behaviors of different gradient-based BLOs. In this work, we formulate BLOs from an optimistic bi-level viewpoint and establish a new gradient-based algorithmic framework, named Bi-level Descent Aggregation (BDA), to partially address the above issues. Specifically, BDA provides a modularized structure to hierarchically aggregate both the upper- and lower-level subproblems to generate our bi-level iterative dynamics. Theoretically, we establish a general convergence analysis template and derive a new proof recipe to investigate the essential theoretical properties of gradient-based BLO methods. Furthermore, this work systematically explores the convergence behavior of BDA in different optimization scenarios, i.e., considering various solution qualities (i.e., global/local/stationary solution) returned from solving approximation subproblems. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed algorithm for hyper-parameter optimization and meta-learning tasks. Source code is available at https://github.com/vis-opt-group/BDA . |
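The descent-aggregation idea in the abstract — combining the upper- and lower-level gradient directions at every lower-level iteration, rather than descending on the lower-level objective alone — can be sketched in a few lines. Everything below is an illustrative assumption, not the paper's actual algorithm: the quadratic objectives, the step size, and the decaying 1/(k+1) aggregation weight are all stand-ins chosen so the behavior is easy to check by hand.

```python
# Toy sketch of Bi-level Descent Aggregation (BDA), based only on the
# high-level description in this record's abstract. Objectives, step size,
# and the aggregation schedule alpha_k are illustrative assumptions.

def grad_y_F(x, y):
    """d/dy of a toy upper-level objective F(x, y) = 0.5 * (y - 1)^2."""
    return y - 1.0

def grad_y_f(x, y):
    """d/dy of a toy lower-level objective f(x, y) = 0.5 * (y - x)^2."""
    return y - x

def bda_inner(x, y0, num_steps=200, step=0.5):
    """At each inner step, descend along an aggregation of the upper- and
    lower-level gradients; the upper-level weight alpha decays so the
    iterates settle toward the lower-level solution."""
    y = y0
    for k in range(1, num_steps + 1):
        alpha = 1.0 / (k + 1)  # decaying aggregation weight (assumed schedule)
        direction = alpha * grad_y_F(x, y) + (1.0 - alpha) * grad_y_f(x, y)
        y -= step * direction
    return y

y = bda_inner(x=3.0, y0=0.0)
print(y)  # approaches the lower-level minimizer y = x = 3 as alpha decays
```

Early iterations, where alpha is large, bias the trajectory toward the upper-level objective's minimizer; as alpha decays the lower-level gradient dominates, which is the mechanism that lets BDA dispense with the Lower-Level Singleton assumption discussed in the abstract.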
doi_str_mv | 10.1109/TPAMI.2022.3140249 |
format | article |
fulltext | fulltext |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2023-01, Vol.45 (1), p.38-57 |
issn | 0162-8828 1939-3539 2160-9292 |
language | eng |
recordid | cdi_pubmed_primary_34982677 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Agglomeration; Algorithms; Approximation algorithms; Bi-level optimization; Cognitive tasks; Computer vision; Convergence; descent aggregation; Dynamical systems; gradient-based method; Heuristic algorithms; hyper-parameter optimization; Iterative methods; Machine learning; meta-learning; Optimization; Source code; Task analysis
title | A General Descent Aggregation Framework for Gradient-Based Bi-Level Optimization |