
A Deep Learning Framework for Infrared and Visible Image Fusion Without Strict Registration

Bibliographic Details
Published in: International Journal of Computer Vision, 2024-05, Vol. 132 (5), pp. 1625–1644
Main Authors: Li, Huafeng; Liu, Junyu; Zhang, Yafei; Liu, Yu
Format: Article
Language: English
Description
Summary: In recent years, significant progress has been made in infrared and visible image fusion, yet existing methods typically assume that the source images have been rigorously registered or aligned before fusion. However, the modality difference between infrared and visible images makes strict automatic alignment highly challenging, which degrades the quality of the subsequent fusion procedure. To address this problem, this paper proposes a deep learning framework for fusing misaligned infrared and visible images, aiming to free the fusion algorithm from strict registration. Technically, we design a CNN-Transformer Hierarchical Interactive Embedding (CTHIE) module, which combines the respective advantages of convolutional neural networks (CNNs) and Transformers, to extract features from the source images. In addition, by characterizing the correlation between the features extracted from the misaligned source images, a Dynamic Re-aggregation Feature Representation (DRFR) module is devised to align the features with a self-attention-based feature re-aggregation scheme. Finally, to effectively exploit the features at different levels of the network, a Fully Perceptual Forward Fusion (FPFF) module, based on interactive transmission of multi-modal features, is introduced to fuse the features and reconstruct the fused image. Experimental results on both synthetic and real-world data demonstrate the effectiveness of the proposed method, verifying the feasibility of directly fusing infrared and visible images without strict registration.
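
For intuition about the self-attention-based feature re-aggregation described in the summary, the following is a minimal PyTorch sketch, not the authors' published code. The module and variable names (FeatureReaggregation, vis_feat, ir_feat) are hypothetical; the sketch assumes flattened spatial feature maps and shows how cross-modal attention can re-aggregate infrared features onto the visible feature grid to compensate for spatial misalignment.

import torch
import torch.nn as nn

class FeatureReaggregation(nn.Module):
    """Sketch: re-aggregate infrared features toward visible features via attention."""
    def __init__(self, dim: int):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)  # queries from visible features
        self.k_proj = nn.Linear(dim, dim)  # keys from infrared features
        self.v_proj = nn.Linear(dim, dim)  # values from infrared features
        self.scale = dim ** -0.5

    def forward(self, vis_feat: torch.Tensor, ir_feat: torch.Tensor) -> torch.Tensor:
        # vis_feat, ir_feat: (batch, num_tokens, dim) flattened spatial features
        q = self.q_proj(vis_feat)
        k = self.k_proj(ir_feat)
        v = self.v_proj(ir_feat)
        # The attention weights characterize the correlation between the two
        # feature sets; each visible-feature location gathers the infrared
        # features that best match it, compensating for misalignment.
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v  # infrared features re-aggregated onto the visible grid

# Usage: an 8x8 feature map flattened to 64 tokens of width 128.
vis = torch.randn(1, 64, 128)
ir = torch.randn(1, 64, 128)
aligned_ir = FeatureReaggregation(128)(vis, ir)
print(aligned_ir.shape)  # torch.Size([1, 64, 128])
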
ISSN: 0920-5691
eISSN: 1573-1405
DOI: 10.1007/s11263-023-01948-x