Hybrid knowledge distillation from intermediate layers for efficient Single Image Super-Resolution

Bibliographic Details
Published in:	Neurocomputing (Amsterdam) 2023-10, Vol.554, p.126592, Article 126592
Main Authors:	Xie, Jiao; Gong, Linrui; Shao, Shitong; Lin, Shaohui; Luo, Linkai
Format:	Article
Language:	English
Subjects:	Discrete wavelet transformation; Image super-resolution; Knowledge distillation; Model compression

Description
Summary:	Convolutional and Transformer models have achieved remarkable results for Single Image Super-Resolution (SISR). However, their heavy memory and computation costs restrict their use in resource-limited scenarios. Knowledge distillation, an effective model compression technique, has received considerable research attention for the SISR task. In this paper, we propose a novel efficient SISR method via hybrid knowledge distillation from intermediate layers, termed HKDSR, which combines knowledge from frequency information with knowledge from RGB information. To accomplish this, we first pre-train the teacher with multiple intermediate upsampling layers to generate intermediate SR outputs. We then construct two kinds of intermediate knowledge, from the Frequency Similarity Matrix (FSM) and from Adaptive Channel Fusion (ACF). FSM uses the Discrete Wavelet Transformation to mine the frequency-similarity relationships between the ground-truth (GT) HR image and the intermediate SR outputs of the teacher and the student. ACF merges the intermediate SR output of the teacher and the GT HR image along the channel dimension to adaptively align the intermediate SR output of the student. Finally, we incorporate the knowledge from FSM and ACF into the reconstruction loss to effectively improve student performance. Extensive experiments demonstrate the effectiveness of HKDSR on different benchmark datasets and network architectures.
• To the best of our knowledge, HKDSR is the first method to combine spatial and frequency information as complementary knowledge for efficient SISR. Furthermore, this complementary knowledge overcomes mismatches in dimension or information, so the distillation adapts to different network architectures.
• Instead of direct alignment in the frequency and RGB spaces, we bridge the frequency and spatial information of the HR image with those of the teacher SR image in the intermediate layers, enabling FSM and ACF to transfer rich texture and edge information from both the HR image and the teacher to the student.
• Comprehensive experiments show the effectiveness of our proposed method in both quantitative and visual results. On Urban100, the student RCAN model compressed by our proposed HKDSR achieves 32.85 dB PSNR, outperforming the baseline and state-of-the-art methods.
ISSN:	0925-2312, 1872-8286
DOI:	10.1016/j.neucom.2023.126592
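
The summary describes FSM as comparing the wavelet-domain frequency content of the GT HR image with the SR outputs of the teacher and the student. The record does not give the exact formulation, so the following is only a minimal sketch under stated assumptions: a hand-written single-level Haar transform stands in for the Discrete Wavelet Transformation, cosine similarity between sub-bands stands in for the similarity measure, and an L1 distance between the teacher's and the student's similarity matrices stands in for the distillation term. The names `haar_dwt2`, `similarity_matrix`, and `fsm_loss` are hypothetical and do not come from the paper.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level orthonormal 2D Haar transform of an (H, W) array (H, W even).

    Returns an array of shape (4, H/2, W/2): one low-frequency band followed
    by three detail bands. This is an illustrative stand-in for the DWT
    mentioned in the summary, not the paper's exact transform.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0           # block average (approximation band)
    row_diff = (a + b - c - d) / 2.0      # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0      # left column minus right column
    diag_diff = (a - b - c + d) / 2.0     # diagonal detail
    return np.stack([low, row_diff, col_diff, diag_diff])

def similarity_matrix(x_bands, y_bands):
    """Cosine similarity between every sub-band of x and every sub-band of y.

    Assumption: the source only names a 'frequency similarity matrix';
    cosine similarity is one plausible choice, used here for illustration.
    """
    fx = x_bands.reshape(x_bands.shape[0], -1)
    fy = y_bands.reshape(y_bands.shape[0], -1)
    fx = fx / (np.linalg.norm(fx, axis=1, keepdims=True) + 1e-8)
    fy = fy / (np.linalg.norm(fy, axis=1, keepdims=True) + 1e-8)
    return fx @ fy.T  # shape (4, 4)

def fsm_loss(student_sr, teacher_sr, hr):
    """Hypothetical distillation term: the student's similarity-to-GT matrix
    is pulled toward the teacher's similarity-to-GT matrix (mean L1)."""
    hr_bands = haar_dwt2(hr)
    s = similarity_matrix(haar_dwt2(student_sr), hr_bands)
    t = similarity_matrix(haar_dwt2(teacher_sr), hr_bands)
    return float(np.mean(np.abs(s - t)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hr = rng.random((8, 8))                                  # toy GT HR image
    teacher = hr + 0.01 * rng.standard_normal((8, 8))        # toy teacher output
    student = hr + 0.10 * rng.standard_normal((8, 8))        # toy student output
    print("FSM-style loss:", fsm_loss(student, teacher, hr))
```

In an actual training setup this term would be computed on intermediate SR outputs with a differentiable framework and added to the reconstruction loss; the NumPy version above only illustrates the data flow on single-channel toy images.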