Loading…

Spirit Distillation: A Model Compression Method with Multi-domain Knowledge Transfer

Recent applications pose requirements of both cross-domain knowledge transfer and model compression to machine learning models due to insufficient training data and limited computational resources. In this paper, we propose a new knowledge distillation model, named Spirit Distillation (SD), which is...

Full description

Saved in:
Bibliographic Details
Published in:arXiv.org 2021-04
Main Authors: Wu, Zhiyuan, Jiang, Yu, Zhao, Minghao, Cui, Chupeng, Yang, Zongmin, Xue, Xinhui, Hong, Qi
Format: Article
Language:English
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Recent applications pose requirements of both cross-domain knowledge transfer and model compression to machine learning models due to insufficient training data and limited computational resources. In this paper, we propose a new knowledge distillation model, named Spirit Distillation (SD), which is a model compression method with multi-domain knowledge transfer. The compact student network mimics out a representation equivalent to the front part of the teacher network, through which the general knowledge can be transferred from the source domain (teacher) to the target domain (student). To further improve the robustness of the student, we extend SD to Enhanced Spirit Distillation (ESD) in exploiting a more comprehensive knowledge by introducing the proximity domain which is similar to the target domain for feature extraction. Results demonstrate that our method can boost mIOU and high-precision accuracy by 1.4% and 8.2% respectively with 78.2% segmentation variance, and can gain a precise compact network with only 41.8% FLOPs.
ISSN:2331-8422