Robust Tracking via Combining Top-Down and Bottom-Up Attention
Published in: | IEEE transactions on circuits and systems for video technology 2024-10, Vol.34 (10), p.9774-9785 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Transformer attention plays an important role in current top-performing trackers. However, it is bottom-up: driven by stimulus and lacking intrinsic prior guidance. This bottom-up attention mechanism emphasizes all objects in the input images rather than only the task-related objects. As a result, the performance of trackers based on bottom-up attention deteriorates in complicated scenes. To address this issue, we propose TBTrack, a robust tracker that combines bottom-up attention with top-down attention while remaining compatible with the existing ViT framework. TBTrack not only uses the existing bottom-up attention mechanism to model long-range relationships among input tokens, but also uses a newly added top-down attention mechanism to focus on the task-related object and further suppress interference from similar objects and backgrounds. Specifically, we first design a top-down prior generation module that combines an adaptive learnable parameter with the template inputs to obtain top-down, task-guided signals. Then, we inject these prior signals into a bottom-up attention module to obtain a combined top-down and bottom-up attention block (TB-Block). Finally, we stack these TB-Blocks to construct our tracker, TBTrack, whose top-down prior guidance lets it focus more on the task-related object. In extensive experiments, TBTrack achieves impressive performance on multiple tracking benchmarks, including GOT-10k, LaSOT, LaSOT_ext, TNL2K, TrackingNet, and UAV123. The code and trained models will be publicly available. |
---|---|
ISSN: | 1051-8215 1558-2205 |
DOI: | 10.1109/TCSVT.2024.3402436 |
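
The three steps named in the summary (generate a top-down prior from the template, inject it into bottom-up attention, stack the resulting blocks) can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not say how the prior is computed or injected, so the element-wise scaling of the mean template token, the additive shift of the attention queries, and all function names below (`top_down_prior`, `attention_weights`, `tb_block`, `tb_track_backbone`) are assumptions chosen only to make the idea concrete.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_down_prior(template_tokens, learned_param):
    """Hypothetical prior: the mean template token, scaled element-wise
    by a learnable vector (an assumption, not the paper's module)."""
    n = len(template_tokens)
    dim = len(learned_param)
    mean = [sum(t[d] for t in template_tokens) / n for d in range(dim)]
    return [p * m for p, m in zip(learned_param, mean)]


def attention_weights(query, keys, prior=None):
    """Scaled dot-product attention weights for one query.
    If a prior is given, it is added to the query, which biases the
    weights toward keys that resemble the prior (assumed injection)."""
    scale = 1.0 / math.sqrt(len(query))
    if prior is not None:
        query = [q + p for q, p in zip(query, prior)]
    return softmax([dot(query, k) * scale for k in keys])


def tb_block(tokens, prior):
    """Hypothetical TB-Block: prior-biased self-attention plus a residual."""
    out = []
    for tok in tokens:
        w = attention_weights(tok, tokens, prior)
        attended = [sum(wi * t[d] for wi, t in zip(w, tokens))
                    for d in range(len(tok))]
        out.append([a + b for a, b in zip(tok, attended)])
    return out


def tb_track_backbone(tokens, template_tokens, learned_param, depth=2):
    """Stack `depth` TB-Blocks that share one top-down prior."""
    prior = top_down_prior(template_tokens, learned_param)
    for _ in range(depth):
        tokens = tb_block(tokens, prior)
    return tokens
```

In this toy setting, a neutral query attends uniformly to a target-like token and a distractor when no prior is given, and shifts its weight toward the target-like token once a template-derived prior is added. A real tracker would use learned projections, multiple heads, and normalization layers, none of which are modeled here.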