
On Implicit Bias in Overparameterized Bilevel Optimization

Many problems in machine learning involve bilevel optimization (BLO), including hyperparameter optimization, meta-learning, and dataset distillation. Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively. In practice, often at least one of these sub-problems is overparameterized. In this case, there are many ways to choose among optima that achieve equivalent objective values. Inspired by recent studies of the implicit bias induced by optimization algorithms in single-level optimization, we investigate the implicit bias of gradient-based algorithms for bilevel optimization. We delineate two standard BLO methods -- cold-start and warm-start -- and show that the converged solution or long-run behavior depends to a large degree on these and other algorithmic choices, such as the hypergradient approximation. We also show that the inner solutions obtained by warm-start BLO can encode a surprising amount of information about the outer objective, even when the outer parameters are low-dimensional. We believe that implicit bias deserves as central a role in the study of bilevel optimization as it has attained in the study of single-level neural net optimization.
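To make the setup concrete: a bilevel problem has the form min_lambda F(w*(lambda)) with w*(lambda) in argmin_w f(w, lambda), where F is the outer objective and f the inner objective. The sketch below is not taken from the paper; the quadratic problem, variable names, and the one-step truncated hypergradient are illustrative assumptions. It contrasts the cold-start scheme, which re-solves the inner problem from a fixed initialization at every outer step, with the warm-start scheme, which continues inner optimization from the previous inner solution.

```python
# Minimal sketch (not the authors' code): cold-start vs. warm-start bilevel optimization
# on an overparameterized ridge-regression inner problem. The data, learning rates, and
# the one-step truncated hypergradient below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(20, 50)), rng.normal(size=20)   # 50 weights, 20 examples
X_va, y_va = rng.normal(size=(20, 50)), rng.normal(size=20)

def inner_grad(w, lam):
    """Gradient in w of the inner objective ||X_tr w - y_tr||^2 + lam * ||w||^2."""
    return 2 * X_tr.T @ (X_tr @ w - y_tr) + 2 * lam * w

def outer_grad(w):
    """Gradient in w of the outer (validation) objective ||X_va w - y_va||^2."""
    return 2 * X_va.T @ (X_va @ w - y_va)

def bilevel(warm_start, outer_steps=50, inner_steps=20, lr_in=1e-3, lr_out=1e-2):
    lam, w = 1.0, np.zeros(50)
    for _ in range(outer_steps):
        if not warm_start:
            w = np.zeros(50)                 # cold start: re-solve the inner problem from scratch
        for _ in range(inner_steps):         # approximately solve the inner problem
            w_prev = w
            w = w - lr_in * inner_grad(w, lam)
        # One-step truncated hypergradient: differentiate only the last inner update
        # w = w_prev - lr_in * inner_grad(w_prev, lam); its derivative in lam is -2 * lr_in * w_prev.
        hypergrad = outer_grad(w) @ (-2 * lr_in * w_prev)
        lam = max(lam - lr_out * hypergrad, 0.0)
    return lam, w

lam_cold, _ = bilevel(warm_start=False)
lam_warm, _ = bilevel(warm_start=True)
print(lam_cold, lam_warm)  # the two schemes generally reach different hyperparameters and inner solutions
```

Differentiating through more unrolled inner steps, or using implicit differentiation, yields other hypergradient approximations; the abstract's point is that such algorithmic choices, together with warm versus cold starting, determine which of the many equivalent inner optima is actually reached.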

Bibliographic Details
Published in: arXiv.org, 2022-12
Main Authors: Vicol, Paul; Lorraine, Jonathan; Pedregosa, Fabian; Duvenaud, David; Grosse, Roger
Format: Article
Language: English
Subjects: Algorithms; Bias; Cold starts; Distillation; Machine learning; Optimization
Identifier: EISSN: 2331-8422
Publisher: Ithaca: Cornell University Library, arXiv.org
Source: Publicly Available Content Database