Mutually guided learning of global semantics and local representations for image restoration
Published in: Multimedia Tools and Applications, 2024-03, Vol. 83(10), pp. 30019-30044
Format: Article
Language: English
Summary: Global semantics and local scene representations are both crucial for image restoration. Although existing methods have proposed various hybrid frameworks of convolutional neural networks (CNNs) and Transformers to account for both, they focus only on the complementarity of the two capabilities. On the one hand, these works neglect the mutual guidance between the two kinds of information; on the other hand, they ignore that the semantic gap between the two different modeling systems, convolution and self-attention, seriously impedes feature fusion. In this work, we propose to establish entanglement between the global and the local to bridge the semantic gap and achieve mutually guided modeling of the two features. In the proposed hybrid framework, the modeling of convolution and self-attention is no longer independent: through the proposed Mutual Transposed Cross Attention (MTCA), the two are made mutually dependent, strengthening the joint modeling of local and global features. Further, we propose the Bidirectional Injection Module (BIM), which lets the global and local features adapt to each other in parallel before fusion, greatly reducing the interference that the semantic gap causes during fusion. The proposed method is evaluated qualitatively and quantitatively on multiple benchmark datasets, and extensive experiments show that it reaches the state of the art with low computational cost. (An illustrative sketch of MTCA and BIM follows this record.)
ISSN: 1380-7501; 1573-7721
DOI: | 10.1007/s11042-023-16724-9 |
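
The abstract names two modules but gives no implementation details. Below is a minimal PyTorch sketch of one plausible reading: "transposed" cross attention computed over channels (as in restoration models such as Restormer, where transposed attention attends across channels rather than spatial positions, keeping cost linear in image size), with queries taken from one stream and keys/values from the other in both directions, followed by a gated bidirectional injection before fusion. All class names, shapes, the gating form, and the single-head design are assumptions of this sketch, not the paper's specification.

```python
# A minimal sketch (not the authors' code) of the two ideas the abstract
# describes: channel-wise ("transposed") cross attention applied in both
# directions between a CNN (local) stream and an attention (global) stream,
# then a bidirectional injection step before fusion. Names and shapes are
# illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TransposedCrossAttention(nn.Module):
    """Channel-wise attention: queries from one feature stream, keys/values
    from the other, so each stream's channels are re-weighted by statistics
    of its counterpart."""

    def __init__(self, dim):
        super().__init__()
        self.q = nn.Conv2d(dim, dim, 1)
        self.kv = nn.Conv2d(dim, dim * 2, 1)
        self.proj = nn.Conv2d(dim, dim, 1)
        self.temperature = nn.Parameter(torch.ones(1))

    def forward(self, x_q, x_kv):
        b, c, h, w = x_q.shape
        q = self.q(x_q).flatten(2)                       # (b, c, h*w)
        k, v = self.kv(x_kv).flatten(2).chunk(2, dim=1)  # each (b, c, h*w)
        # Attention over the channel dimension: (b, c, c) instead of the
        # usual (b, hw, hw), so cost is linear in image size.
        q = F.normalize(q, dim=-1)
        k = F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature
        out = attn.softmax(dim=-1) @ v                   # (b, c, h*w)
        return self.proj(out.view(b, c, h, w))


class MutualBlock(nn.Module):
    """One hybrid block: local (CNN) and global (attention) branches each
    attend to the other (mutual cross attention), then a bidirectional
    injection adapts the two features to each other before a simple fusion.
    The sigmoid gating is an assumed form of "injection"."""

    def __init__(self, dim):
        super().__init__()
        self.local = nn.Sequential(                      # plain conv branch
            nn.Conv2d(dim, dim, 3, padding=1), nn.GELU(),
            nn.Conv2d(dim, dim, 3, padding=1))
        self.global_to_local = TransposedCrossAttention(dim)
        self.local_to_global = TransposedCrossAttention(dim)
        self.gate_l = nn.Conv2d(dim, dim, 1)
        self.gate_g = nn.Conv2d(dim, dim, 1)
        self.fuse = nn.Conv2d(dim * 2, dim, 1)

    def forward(self, f_local, f_global):
        f_local = f_local + self.local(f_local)
        # Mutual cross attention: queries from one stream, keys/values from
        # the other, in both directions.
        g = f_global + self.global_to_local(f_global, f_local)
        l = f_local + self.local_to_global(f_local, f_global)
        # Parallel bidirectional injection before fusion: each stream is
        # modulated by a gate predicted from the other's pre-gated features.
        l_in, g_in = l, g
        l = l_in * torch.sigmoid(self.gate_l(g_in))
        g = g_in * torch.sigmoid(self.gate_g(l_in))
        return self.fuse(torch.cat([l, g], dim=1))


if __name__ == "__main__":
    block = MutualBlock(dim=32)
    f_l = torch.randn(1, 32, 64, 64)
    f_g = torch.randn(1, 32, 64, 64)
    print(block(f_l, f_g).shape)  # torch.Size([1, 32, 64, 64])
```

Channel-wise attention is used here because "transposed" attention in the restoration literature conventionally means attending across channels rather than spatial positions; whether MTCA follows that convention, and how BIM's injection is actually parameterized, are assumptions of this sketch.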