Dual path features interaction network for efficient image super-resolution
Published in: | Neurocomputing (Amsterdam) 2024-10, Vol.601, p.128226, Article 128226 |
---|---|
Format: | Article |
Language: | English |
Summary: | Image super-resolution (SR) is a crucial task in computer vision that reconstructs a high-resolution (HR) image from its low-resolution (LR) counterpart. Transformer-based methods excel at establishing long-range dependencies but suffer from high computational complexity and have difficulty capturing fine-grained features. Conversely, CNN-based methods capture fine-grained features well but are limited by fixed receptive fields. In addition, most SR methods suffer from channel redundancy, which increases computational overhead. In this paper, we propose a Dual Path Features Interaction Network (DPFINet) for efficient image SR, which consists of two components: a) To alleviate feature channel redundancy, we propose a Local–Global Features Modeling (LGFM) method that models global and local features concurrently by splitting the features along the channel dimension. Within LGFM, a Shift Window Linear Attention (SWLA) layer captures global information from one part of the split features through a large shift window. Meanwhile, a Multi-Scale Detail Enhancement (MSDE) layer encodes the remaining split features to facilitate detail reconstruction through an interactive fusion of semantic and local information, thereby compensating for the limitations of SWLA in capturing fine-grained features. b) A Cross-Level Features Interaction (CLFI) method fuses the global and local features modeled by the two different network structures (SWLA and MSDE), using a novel residual fusion mechanism designed to preserve both global and local information while letting them complement each other. Extensive experiments demonstrate that our method outperforms most state-of-the-art SR methods on five benchmark datasets. Notably, during inference, our approach improves performance by 0.41 dB and reduces memory consumption by approximately 79% compared to DiVANet (Behjati et al., 2023) on the Manga109 (×4) dataset. |
---|---|
ISSN: | 0925-2312 |
DOI: | 10.1016/j.neucom.2024.128226 |
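
The dual-path idea described in the summary (split the channels, model one group globally and the other locally, then fuse with a residual connection) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record gives no layer details, so every function name here is hypothetical, `global_path` uses plain linear attention over all pixels with no shifted windows or learned projections as a stand-in for SWLA, `local_path` uses a two-scale unsharp mask as a stand-in for MSDE, and `dual_path_block` uses concatenation plus a skip connection as a stand-in for CLFI.

```python
import numpy as np


def split_channels(x, ratio=0.5):
    """Split a (C, H, W) feature map into two groups along the channel axis."""
    k = int(x.shape[0] * ratio)
    return x[:k], x[k:]


def global_path(x):
    """Stand-in for the global branch: linear attention over all pixels.

    Assumption: queries, keys and values are the raw features, and the
    feature map phi(t) = relu(t) + 1 keeps all attention weights positive.
    """
    c, h, w = x.shape
    tokens = x.reshape(c, h * w).T                 # (N, C), one token per pixel
    phi = np.maximum(tokens, 0.0) + 1.0            # positive feature map
    kv = phi.T @ tokens                            # (C, C), computed once: cost linear in N
    norm = phi @ phi.sum(axis=0)                   # (N,), per-query normaliser
    out = (phi @ kv) / norm[:, None]               # (N, C)
    return out.T.reshape(c, h, w)


def box_blur(x, k):
    """Per-channel k x k mean filter with edge padding."""
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.zeros(x.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += xp[:, dy:dy + x.shape[1], dx:dx + x.shape[2]]
    return out / (k * k)


def local_path(x):
    """Stand-in for the local branch: boost detail at two scales (3x3 and 5x5)."""
    detail = sum(x - box_blur(x, k) for k in (3, 5)) / 2.0
    return x + detail


def dual_path_block(x, ratio=0.5):
    """Split channels, run both branches, then fuse with a residual connection."""
    xg, xl = split_channels(x, ratio)
    fused = np.concatenate([global_path(xg), local_path(xl)], axis=0)
    return x + fused                               # residual keeps the input signal


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((8, 16, 16))
    print(dual_path_block(feats).shape)
```

Because each branch sees only part of the channels, neither the attention nor the local filtering runs at full width, which is the intuition behind the reduced channel redundancy claimed in the summary; the actual savings depend on the real layers, which this sketch does not reproduce.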