A Conditional Diffusion Model With Fast Sampling Strategy for Remote Sensing Image Super-Resolution

Bibliographic Details
Published in:	IEEE transactions on geoscience and remote sensing 2024, Vol.62, p.1-16
Main Authors:	Meng, Fanen, Chen, Yijun, Jing, Haoyu, Zhang, Laifu, Yan, Yiming, Ren, Yingchao, Wu, Sensen, Feng, Tian, Liu, Renyi, Du, Zhenhong
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Attention; Computational modeling; Conditional diffusion model; Deep learning; Diffusion; Diffusion models; Diffusion rate; generative models; Image enhancement; Image quality; Image resolution; Machine learning; Neural networks; Remote sensing; Sampling; Signal to noise ratio; Source code; super-resolution (SR); Superresolution; Training; Transformers; Visual discrimination learning; Visual perception; Visualization

Description
Summary:	Conventional deep learning-based methods for single remote sensing image super-resolution (SRSISR) have made remarkable progress. However, the super-resolution (SR) outputs of these methods are not yet satisfactory in visual quality. Recent diffusion model-based generative deep learning models are capable of enhancing the visual quality of output images, but this capability is limited by their low sampling efficiency. In this article, we propose FastDiffSR, an SRSISR method based on a conditional diffusion model. Specifically, we devise a novel sampling strategy that reduces the number of sampling steps required by the diffusion model while preserving sampling quality. In addition, the residual image is adopted to reduce computational costs, and we demonstrate that integrating channel attention and spatial attention further improves the visual quality of output images. Compared to the state-of-the-art (SOTA) convolutional neural network (CNN)-based, GAN-based, and Transformer-based SR methods, our FastDiffSR improves the learned perceptual image patch similarity (LPIPS) by 0.1-0.2 and achieves better visual results in some real-world scenes. Compared with existing diffusion-based SR methods, our FastDiffSR achieves significant improvements in the pixel-level evaluation metric peak signal-to-noise ratio (PSNR) while having fewer model parameters, and it obtains better SR results on Vaihingen data with inference that is 2.8-28 times faster, showing excellent generalization ability and time efficiency. Our code will be open source at https://github.com/Meng-333/FastDiffSR .
ISSN:	0196-2892 1558-0644
DOI:	10.1109/TGRS.2024.3458009