
Pixel integration from fine to coarse for lightweight image super-resolution

Bibliographic Details
Published in: Image and Vision Computing, 2025-02, Vol. 154, Article 105362
Main Authors: Wu, Yuxiang, Wang, Xiaoyan, Liu, Xiaoyan, Gao, Yuzhao, Dou, Yan
Format: Article
Language: English
Description
Summary: Recently, Transformer-based methods have made significant progress on image super-resolution. They encode long-range dependencies between image patches through the self-attention mechanism. However, extracting tokens from the entire feature map incurs a high computational cost. In this paper, we propose a novel lightweight image super-resolution approach, the pixel integration network (PIN). Specifically, our method employs fine pixel integration and coarse pixel integration over local and global receptive fields, respectively. In particular, coarse pixel integration is implemented by retractable attention, which consists of dense and sparse self-attention. To enrich features with contextual information, a spatial-gate mechanism and depth-wise convolution are introduced into the multi-layer perceptron. In addition, a spatial-frequency fusion block is adopted at the end of deep feature extraction to obtain more comprehensive, detailed, and stable information. Extensive experiments demonstrate that PIN achieves state-of-the-art performance on lightweight super-resolution with a small number of parameters.
Highlights:
• PIN is proposed to encode pixel information from fine to coarse.
• Retractable self-attention is used to interact among different receptive domains.
• A spatial-gate mechanism is introduced to enrich features with contextual information.
• Fast Fourier convolution is adopted to retain more high-frequency information.
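
The spatial-gate mechanism with depth-wise convolution mentioned in the summary can be illustrated with a minimal PyTorch sketch: the MLP's hidden features are split in half, one half is passed through a depth-wise 3×3 convolution to gather local spatial context, and the result gates the other half. The class name, expansion ratio, and kernel size below are assumptions for illustration, not the authors' exact design.

```python
# Minimal sketch of a spatial-gated MLP with depth-wise convolution.
# Hypothetical layer sizes; not the paper's exact module.
import torch
import torch.nn as nn

class SpatialGateMLP(nn.Module):
    def __init__(self, dim: int, expansion: int = 2):
        super().__init__()
        hidden = dim * expansion
        self.fc1 = nn.Linear(dim, hidden)
        # Depth-wise 3x3 convolution injects local spatial context into the gate branch.
        self.dwconv = nn.Conv2d(hidden // 2, hidden // 2, kernel_size=3,
                                padding=1, groups=hidden // 2)
        self.fc2 = nn.Linear(hidden // 2, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor, h: int, w: int) -> torch.Tensor:
        # x: (batch, h*w, dim) token sequence coming out of an attention block.
        x = self.act(self.fc1(x))
        content, gate = x.chunk(2, dim=-1)               # split channels in half
        b, n, c = gate.shape
        gate = gate.transpose(1, 2).reshape(b, c, h, w)  # tokens -> feature map
        gate = self.dwconv(gate).reshape(b, c, n).transpose(1, 2)
        return self.fc2(content * gate)                  # spatial gating

x = torch.randn(1, 16 * 16, 64)             # 16x16 tokens, 64 channels
print(SpatialGateMLP(64)(x, 16, 16).shape)  # torch.Size([1, 256, 64])
```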
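Likewise, a spatial-frequency fusion step built on fast Fourier convolution can be sketched as a spatial branch (a standard 3×3 convolution) fused with a spectral branch that applies a 1×1 convolution to the real and imaginary parts of the feature map's 2-D FFT. This is a hedged illustration under assumed channel counts and layer choices, not the paper's exact block.

```python
# Minimal sketch of spatial-frequency fusion via a Fourier-domain convolution.
# Hypothetical layout; not the paper's exact module.
import torch
import torch.nn as nn

class SpatialFrequencyFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.spatial = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # 1x1 convolution over concatenated real/imaginary parts in the frequency domain.
        self.spectral = nn.Conv2d(channels * 2, channels * 2, kernel_size=1)
        self.fuse = nn.Conv2d(channels * 2, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        spatial = self.spatial(x)                               # local spatial branch
        freq = torch.fft.rfft2(x, norm="ortho")                 # (B, C, H, W//2+1), complex
        freq = torch.cat([freq.real, freq.imag], dim=1)
        freq = self.spectral(freq)                              # global frequency branch
        real, imag = freq.chunk(2, dim=1)
        freq = torch.fft.irfft2(torch.complex(real, imag),
                                s=x.shape[-2:], norm="ortho")   # back to the spatial domain
        return self.fuse(torch.cat([spatial, freq], dim=1))

x = torch.randn(1, 64, 32, 32)
print(SpatialFrequencyFusion(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```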
ISSN: 0262-8856
DOI: 10.1016/j.imavis.2024.105362