
On Implicit Bias in Overparameterized Bilevel Optimization

Many problems in machine learning involve bilevel optimization (BLO), including hyperparameter optimization, meta-learning, and dataset distillation. Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively. In practice, often at least one of these sub-problems is overparameterized. In this case, there are many ways to choose among optima that achieve equivalent objective values. Inspired by recent studies of the implicit bias induced by optimization algorithms in single-level optimization, we investigate the implicit bias of gradient-based algorithms for bilevel optimization. We delineate two standard BLO methods -- cold-start and warm-start -- and show that the converged solution or long-run behavior depends to a large degree on these and other algorithmic choices, such as the hypergradient approximation. We also show that the inner solutions obtained by warm-start BLO can encode a surprising amount of information about the outer objective, even when the outer parameters are low-dimensional. We believe that implicit bias deserves as central a role in the study of bilevel optimization as it has attained in the study of single-level neural net optimization.
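To make the setup concrete: a bilevel problem has the form min_lambda F(w*(lambda)) with w*(lambda) in argmin_w f(w, lambda), where F is the outer objective and f the inner objective. The sketch below is not taken from the paper; the quadratic problem, variable names, and the one-step truncated hypergradient are illustrative assumptions. It contrasts the cold-start scheme, which re-solves the inner problem from a fixed initialization at every outer step, with the warm-start scheme, which continues inner optimization from the previous inner solution.

```python
# Minimal sketch (not the authors' code): cold-start vs. warm-start bilevel optimization
# on an overparameterized ridge-regression inner problem. The data, learning rates, and
# the one-step truncated hypergradient below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(20, 50)), rng.normal(size=20)   # 50 weights, 20 examples
X_va, y_va = rng.normal(size=(20, 50)), rng.normal(size=20)

def inner_grad(w, lam):
    """Gradient in w of the inner objective ||X_tr w - y_tr||^2 + lam * ||w||^2."""
    return 2 * X_tr.T @ (X_tr @ w - y_tr) + 2 * lam * w

def outer_grad(w):
    """Gradient in w of the outer (validation) objective ||X_va w - y_va||^2."""
    return 2 * X_va.T @ (X_va @ w - y_va)

def bilevel(warm_start, outer_steps=50, inner_steps=20, lr_in=1e-3, lr_out=1e-2):
    lam, w = 1.0, np.zeros(50)
    for _ in range(outer_steps):
        if not warm_start:
            w = np.zeros(50)                 # cold start: re-solve the inner problem from scratch
        for _ in range(inner_steps):         # approximately solve the inner problem
            w_prev = w
            w = w - lr_in * inner_grad(w, lam)
        # One-step truncated hypergradient: differentiate only the last inner update
        # w = w_prev - lr_in * inner_grad(w_prev, lam); its derivative in lam is -2 * lr_in * w_prev.
        hypergrad = outer_grad(w) @ (-2 * lr_in * w_prev)
        lam = max(lam - lr_out * hypergrad, 0.0)
    return lam, w

lam_cold, _ = bilevel(warm_start=False)
lam_warm, _ = bilevel(warm_start=True)
print(lam_cold, lam_warm)  # the two schemes generally reach different hyperparameters and inner solutions
```

Differentiating through more unrolled inner steps, or using implicit differentiation, yields other hypergradient approximations; the abstract's point is that such algorithmic choices, together with warm versus cold starting, determine which of the many equivalent inner optima is actually reached.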

Bibliographic Details
Published in: arXiv.org, 2022-12
Main Authors: Vicol, Paul; Lorraine, Jonathan; Pedregosa, Fabian; Duvenaud, David; Grosse, Roger
Format: Article
Language: English
Subjects: Algorithms; Bias; Cold starts; Distillation; Machine learning; Optimization
Identifier: EISSN: 2331-8422
Publisher: Ithaca: Cornell University Library, arXiv.org
Source: Publicly Available Content Database