
Thermal infrared and visible sequences tracking via dual adversarial pixel fusion

Bibliographic Details
Published in: Multimedia Tools and Applications, 2023-12, Vol. 83 (40), p. 88303-88322
Main Authors: Zheng, Hang, Yuan, Nangezi, Ding, Hongwei, Hu, Peng, Yang, Zhijun
Format: Article
Language: English
Description
Summary: Owing to the strong complementarity of visible and thermal infrared imaging, combining the two modalities for target tracking has received considerable interest and has developed rapidly, since it overcomes the limitations of visible-light imaging alone. The original motivation for introducing thermal infrared images into visual tracking is to exploit the complementary benefits of the two modalities. Therefore, mining more useful and complementary information from thermal infrared (T) images is the key to high-quality tracking when visible (RGB) image quality is poor. Existing image-fusion or feature-concatenation algorithms do not fully exploit the correlation and complementary information between RGB and T images; they are prone to feature redundancy and susceptible to interference during tracking. Moreover, the reliability of single-modal data changes over time, which limits the effectiveness of feature-level modal sharing, whereas the RGBT images obtained from pixel-level fusion contain richer information than single-modal images and are more conducive to modal information sharing and to detecting the tracked target. We therefore design a dual adversarial pixel fusion network that adaptively fuses the two modal images to generate superpixels for modal sharing in RGBT target tracking. In addition, to obtain more accurate tracking results, we design a new weight fusion method that infers the fusion weights of the three modalities (RGB, T, and RGBT) to locate the optimal target in each frame. We improve the MANet algorithm with these two methods. Extensive experiments on two RGBT tracking datasets show that, compared with the original MANet tracker, the proposed method achieves better accuracy and success rate.
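
The abstract does not specify how the per-frame fusion weights over the RGB, T, and RGBT branches are computed or applied. As a rough, hypothetical sketch only (the function name, the reliability inputs, and the softmax weighting are assumptions, not the paper's actual mechanism), the Python snippet below combines three candidate response maps with softmax-normalized weights and takes the peak of the fused map as the target location:

    import numpy as np

    def fuse_responses(resp_rgb, resp_t, resp_rgbt, reliabilities):
        # Softmax-normalize the raw reliability scores so the three
        # modality weights are positive and sum to 1. How reliabilities
        # would be estimated per frame is not described in the abstract.
        scores = np.asarray(reliabilities, dtype=np.float64)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        fused = w[0] * resp_rgb + w[1] * resp_t + w[2] * resp_rgbt
        return fused, w

    # Toy usage: fuse three random 4x4 response maps and locate the peak.
    rng = np.random.default_rng(0)
    maps = [rng.random((4, 4)) for _ in range(3)]
    fused, weights = fuse_responses(*maps, reliabilities=[0.2, 0.5, 1.1])
    y, x = np.unravel_index(fused.argmax(), fused.shape)
    print("weights:", weights.round(3), "peak at:", (y, x))

The softmax keeps the fused map a convex combination of the per-modality maps, so a frame in which one modality degrades (e.g., RGB at night) can simply be down-weighted rather than discarded.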
ISSN: 1380-7501
1573-7721
DOI: 10.1007/s11042-023-17721-8