Consistent Representation Mining for Multi-Drone Single Object Tracking
Published in: | IEEE Transactions on Circuits and Systems for Video Technology, 2024-11, Vol. 34 (11), pp. 10845-10859 |
---|---|
Main Authors: | , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Aerial tracking has received growing attention due to its broad practical applications. However, single-view aerial trackers are still limited by challenges such as severe appearance variations and occlusions. Existing multi-view trackers use cross-drone information to address these issues but struggle to overcome heterogeneous differences. In this paper, we propose a novel Transformer-based consistent representation mining (CRM) module that captures invariant target information and suppresses the heterogeneous differences in cross-drone information. First, CRM divides the heterogeneous input into regions and measures semantic relevance by modeling the relations between these regions. Then, reliable target regions are coarsely localized by selecting the top-k most relevant regions. Next, global perception is performed on these reliable regions via multi-head sparse self-attention, further enhancing the understanding of the target and suppressing background regions. As a plug-and-play module, CRM can be flexibly embedded into different tracking frameworks (CRM-Siam and CRM-DiMP). In addition, a multi-view correction strategy is designed to ensure timely correction of multi-view information and full use of its own information. Extensive experiments on the multi-drone dataset MDOT demonstrate that CRM-assisted trackers improve the accuracy and robustness of the multi-drone tracking system and outperform other outstanding trackers. The code and models are available at https://github.com/xyl-507/CRM. |
ISSN: | 1051-8215, 1558-2205 |
DOI: | 10.1109/TCSVT.2024.3411301 |
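
The three steps named in the summary (region relevance, top-k selection, multi-head sparse self-attention) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available at the repository linked above. Everything here is an assumption made for illustration: the function names, the mean dot-product affinity used as the relevance score, the absence of learned query/key/value projections, and the toy 4-dimensional region features.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def region_relevance(regions):
    """Assumed relevance score: mean scaled dot-product affinity to all regions."""
    scale = math.sqrt(len(regions[0]))
    n = len(regions)
    return [sum(dot(r, other) for other in regions) / (n * scale) for r in regions]


def select_top_k(scores, k):
    """Indices of the k highest-scoring regions, returned in original order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def sparse_self_attention(regions, keep, num_heads=2):
    """Multi-head self-attention computed only among the kept regions.

    Simplification: queries, keys, and values are the raw features
    (identity projections); each head works on its own slice of the feature.
    """
    dim = len(regions[0])
    head_dim = dim // num_heads
    out = {}
    for i in keep:
        fused = []
        for h in range(num_heads):
            lo, hi = h * head_dim, (h + 1) * head_dim
            q = regions[i][lo:hi]
            logits = [dot(q, regions[j][lo:hi]) / math.sqrt(head_dim) for j in keep]
            weights = softmax(logits)
            for d in range(lo, hi):
                fused.append(sum(w * regions[j][d] for w, j in zip(weights, keep)))
        out[i] = fused
    return out


# Toy input: two target-like regions and two background-like regions.
regions = [
    [1.0, 0.9, 1.0, 0.8],
    [0.9, 1.0, 0.8, 1.0],
    [-0.1, 0.0, 0.1, 0.0],
    [0.0, -0.2, 0.0, 0.1],
]
keep = select_top_k(region_relevance(regions), k=2)
out = sparse_self_attention(regions, keep, num_heads=2)
```

In this toy input the two target-like regions score highest, so attention is computed only between them and the background-like regions contribute nothing to the output. A real module would add learned projections and operate on feature maps from a tracking backbone.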