
Effectiveness of Arbitrary Transfer Sets for Data-free Knowledge Distillation

Knowledge Distillation is an effective method to transfer the learning across deep neural networks. Typically, the dataset originally used for training the Teacher model is chosen as the "Transfer Set" to conduct the knowledge transfer to the Student. However, this original training data may not always be freely available due to privacy or sensitivity concerns. In such scenarios, existing approaches either iteratively compose a synthetic set representative of the original training dataset, one sample at a time, or learn a generative model to compose such a transfer set. However, both these approaches involve complex optimization (GAN training or several backpropagation steps to synthesize one sample) and are often computationally expensive. In this paper, as a simple alternative, we investigate the effectiveness of "arbitrary transfer sets" such as random noise, publicly available synthetic, and natural datasets, all of which are completely unrelated to the original training dataset in terms of their visual or semantic contents. Through extensive experiments on multiple benchmark datasets such as MNIST, FMNIST, CIFAR-10 and CIFAR-100, we discover and validate the surprising effectiveness of using arbitrary data to conduct knowledge distillation when this dataset is "target-class balanced". We believe that this important observation can potentially lead to designing baselines for the data-free knowledge distillation task.
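
As a rough illustration of the approach summarized above, the sketch below shows (a) how an arbitrary, unrelated pool of images could be filtered into a "target-class balanced" transfer set using the Teacher's own predictions, and (b) a standard temperature-scaled distillation step on a batch from that set. It assumes PyTorch; teacher, student, arbitrary_loader, per_class and T are hypothetical placeholders, and this is only a minimal sketch of the idea, not the authors' implementation.

# Minimal sketch (not the authors' code): distillation from a Teacher to a
# Student over an arbitrary transfer set, assuming PyTorch.
import torch
import torch.nn.functional as F

def balanced_transfer_set(teacher, arbitrary_loader, num_classes, per_class):
    """Select samples from an unrelated dataset so that the classes the
    Teacher assigns to them are roughly balanced -- the "target-class
    balanced" condition highlighted in the abstract."""
    buckets = {c: [] for c in range(num_classes)}
    teacher.eval()
    with torch.no_grad():
        for x, _ in arbitrary_loader:  # any labels of the arbitrary data are ignored
            preds = teacher(x).argmax(dim=1)
            for xi, c in zip(x, preds.tolist()):
                if len(buckets[c]) < per_class:
                    buckets[c].append(xi)
    return torch.stack([xi for bucket in buckets.values() for xi in bucket])

def distillation_step(teacher, student, optimizer, x, T=4.0):
    """One KD update: match the Student's temperature-softened outputs to the
    Teacher's on a batch x drawn from the (balanced) transfer set."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    # Hinton-style soft-target loss: KL divergence between temperature-scaled
    # distributions, rescaled by T^2 so gradients stay comparable across T.
    loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()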


Bibliographic Details
Published in: arXiv.org, 2020-11
Main Authors: Nayak, Gaurav Kumar; Konda Reddy Mopuri; Chakraborty, Anirban
Format: Article
Language: English
Subjects: Artificial neural networks; Back propagation; Back propagation networks; Datasets; Distillation; Knowledge management; Machine learning; Optimization; Random noise; Training
Online Access: Get full text
container_title arXiv.org
creator Nayak, Gaurav Kumar
Konda Reddy Mopuri
Chakraborty, Anirban
description Knowledge Distillation is an effective method to transfer the learning across deep neural networks. Typically, the dataset originally used for training the Teacher model is chosen as the "Transfer Set" to conduct the knowledge transfer to the Student. However, this original training data may not always be freely available due to privacy or sensitivity concerns. In such scenarios, existing approaches either iteratively compose a synthetic set representative of the original training dataset, one sample at a time or learn a generative model to compose such a transfer set. However, both these approaches involve complex optimization (GAN training or several backpropagation steps to synthesize one sample) and are often computationally expensive. In this paper, as a simple alternative, we investigate the effectiveness of "arbitrary transfer sets" such as random noise, publicly available synthetic, and natural datasets, all of which are completely unrelated to the original training dataset in terms of their visual or semantic contents. Through extensive experiments on multiple benchmark datasets such as MNIST, FMNIST, CIFAR-10 and CIFAR-100, we discover and validate surprising effectiveness of using arbitrary data to conduct knowledge distillation when this dataset is "target-class balanced". We believe that this important observation can potentially lead to designing baselines for the data-free knowledge distillation task.
format article
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2020-11
issn 2331-8422
language eng
recordid cdi_proquest_journals_2462301133
source Publicly Available Content Database
subjects Artificial neural networks
Back propagation
Back propagation networks
Datasets
Distillation
Knowledge management
Machine learning
Optimization
Random noise
Training
title Effectiveness of Arbitrary Transfer Sets for Data-free Knowledge Distillation
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T18%3A25%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Effectiveness%20of%20Arbitrary%20Transfer%20Sets%20for%20Data-free%20Knowledge%20Distillation&rft.jtitle=arXiv.org&rft.au=Nayak,%20Gaurav%20Kumar&rft.date=2020-11-18&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2462301133%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_24623011333%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2462301133&rft_id=info:pmid/&rfr_iscdi=true