Pixel integration from fine to coarse for lightweight image super-resolution
Published in: Image and Vision Computing, 2025-02, Vol. 154, Article 105362
Format: Article
Language: English
Summary: Recently, Transformer-based methods have made significant progress on image super-resolution. They encode long-range dependencies between image patches through the self-attention mechanism. However, extracting tokens from the entire feature map is computationally expensive. In this paper, we propose a novel lightweight image super-resolution approach, the pixel integration network (PIN). Specifically, our method performs fine pixel integration and coarse pixel integration over local and global receptive fields. In particular, coarse pixel integration is implemented by retractable attention, consisting of dense and sparse self-attention. To enrich features with contextual information, a spatial-gate mechanism and depth-wise convolution are introduced into the multi-layer perceptron. In addition, a spatial-frequency fusion block is adopted at the end of deep feature extraction to obtain more comprehensive, detailed, and stable information. Extensive experiments demonstrate that PIN achieves state-of-the-art performance on lightweight super-resolution with a small number of parameters.
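The spatial-gate mechanism with depth-wise convolution mentioned in the summary can be illustrated with a minimal sketch. The block below is not the authors' code; the class name SpatialGateMLP, the expansion ratio, and the exact layer layout are assumptions for illustration. It expands channels, injects local spatial context with a depth-wise convolution, then splits the features into a gate and a value so that one half modulates the other before projecting back to the input dimension.

```python
# Minimal sketch of a spatial-gated MLP block (assumed design, not the paper's code).
import torch
import torch.nn as nn


class SpatialGateMLP(nn.Module):
    def __init__(self, dim: int, expansion: int = 2):
        super().__init__()
        hidden = dim * expansion
        self.fc1 = nn.Conv2d(dim, hidden, kernel_size=1)        # channel expansion
        self.act = nn.GELU()
        # depth-wise convolution injects local spatial context into the MLP
        self.dwconv = nn.Conv2d(hidden, hidden, kernel_size=3,
                                padding=1, groups=hidden)
        self.fc2 = nn.Conv2d(hidden // 2, dim, kernel_size=1)   # channel reduction

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map
        y = self.act(self.fc1(x))
        y = self.dwconv(y)
        gate, value = y.chunk(2, dim=1)   # spatial-gate mechanism: split and modulate
        return self.fc2(gate * value)


if __name__ == "__main__":
    block = SpatialGateMLP(dim=64)
    out = block(torch.randn(1, 64, 48, 48))
    print(out.shape)  # torch.Size([1, 64, 48, 48])
```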
Highlights:
• PIN is proposed to encode pixel information from fine to coarse.
• Retractable self-attention is used to interact among different receptive domains.
• A spatial-gate mechanism is introduced to enrich features with contextual information.
• Fast Fourier convolution is adopted to retain more high-frequency information.
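The last highlight refers to Fourier-domain processing for preserving high-frequency detail. The sketch below is an assumed illustration (the class name SpatialFrequencyFusion and the branch design are not from the paper): a 3x3 spatial branch is fused with a frequency branch that applies 1x1 convolutions to the spectrum obtained with torch.fft.rfft2, giving the block a global receptive field over the feature map.

```python
# Minimal sketch of a spatial-frequency fusion block (assumed design, not the paper's code).
import torch
import torch.nn as nn


class SpatialFrequencyFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.spatial = nn.Conv2d(dim, dim, kernel_size=3, padding=1)
        # frequency branch operates on concatenated real/imaginary parts of the spectrum
        self.freq = nn.Sequential(
            nn.Conv2d(dim * 2, dim * 2, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim * 2, dim * 2, kernel_size=1),
        )
        self.fuse = nn.Conv2d(dim * 2, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W)
        b, c, h, w = x.shape
        spec = torch.fft.rfft2(x, norm="ortho")              # complex spectrum (B, C, H, W//2+1)
        spec = torch.cat([spec.real, spec.imag], dim=1)      # (B, 2C, H, W//2+1)
        spec = self.freq(spec)
        real, imag = spec.chunk(2, dim=1)
        freq_out = torch.fft.irfft2(torch.complex(real, imag),
                                    s=(h, w), norm="ortho")  # back to (B, C, H, W)
        return self.fuse(torch.cat([self.spatial(x), freq_out], dim=1))


if __name__ == "__main__":
    block = SpatialFrequencyFusion(dim=64)
    print(block(torch.randn(1, 64, 48, 48)).shape)  # torch.Size([1, 64, 48, 48])
```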
ISSN: 0262-8856
DOI: 10.1016/j.imavis.2024.105362