
Weighted Distillation with Unlabeled Examples

Distillation with unlabeled examples is a popular and powerful method for training deep neural networks in settings where the amount of labeled data is limited: A large "teacher" neural network is trained on the labeled data available, and then it is used to generate labels on...

Bibliographic Details
Published in:	arXiv.org 2022-10
Main Authors:	Iliopoulos, Fotis; Kontonis, Vasilis; Baykal, Cenk; Menghani, Gaurav; Trinh, Khoa; Vee, Erik
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Datasets; Distillation; Labels; Neural networks; Teachers; Training

container_title	arXiv.org
creator	Iliopoulos, Fotis; Kontonis, Vasilis; Baykal, Cenk; Menghani, Gaurav; Trinh, Khoa; Vee, Erik
description	Distillation with unlabeled examples is a popular and powerful method for training deep neural networks in settings where the amount of labeled data is limited: A large "teacher" neural network is trained on the labeled data available, and then it is used to generate labels on an unlabeled dataset (typically much larger in size). These labels are then utilized to train the smaller "student" model which will actually be deployed. Naturally, the success of the approach depends on the quality of the teacher's labels, since the student could be confused if trained on inaccurate data. This paper proposes a principled approach for addressing this issue based on a "debiasing" reweighting of the student's loss function tailored to the distillation training paradigm. Our method is hyper-parameter free, data-agnostic, and simple to implement. We demonstrate significant improvements on popular academic datasets and we accompany our results with a theoretical analysis which rigorously justifies the performance of our method in certain settings.
format	article
date	2022-10-13
publisher	Ithaca: Cornell University Library, arXiv.org
rights	2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
oa	free_for_read
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2022-10
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2724768679
source	Publicly Available Content Database
subjects	Artificial neural networks; Datasets; Distillation; Labels; Neural networks; Teachers; Training
title	Weighted Distillation with Unlabeled Examples
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T16%3A46%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Weighted%20Distillation%20with%20Unlabeled%20Examples&rft.jtitle=arXiv.org&rft.au=Iliopoulos,%20Fotis&rft.date=2022-10-13&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2724768679%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_27247686793%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2724768679&rft_id=info:pmid/&rfr_iscdi=true