Two-stage aware attentional Siamese network for visual tracking
Published in: | Pattern recognition 2022-04, Vol.124, p.108502, Article 108502 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Summary: | •We propose a novel two-stage aware training framework for Siamese networks, in which position-aware and appearance-aware training schemes optimize the shallow and the deep network layers, respectively. This contribution helps the Siamese tracker achieve precise and robust visual tracking.
•An effective feature selection module is presented to solve the online adaptation problem of the Siamese tracker. By analyzing how the feature distribution changes, the module combines diverse attention networks to find the features that are truly discriminative for the current object.
•The proposed tracker is evaluated extensively on four popular benchmark datasets. The results demonstrate that it performs better than other state-of-the-art methods in terms of accuracy and robustness.
Siamese networks have achieved great success in visual tracking thanks to their speed and accuracy. However, tracking an object both precisely and robustly remains challenging. One reason is that precision and robustness require multiple types of features, which cannot be obtained in a single training phase. Moreover, Siamese networks usually struggle with the online adaptation problem. In this paper, we present a novel two-stage aware attentional Siamese network for tracking (Ta-ASiam). Concretely, we first propose a position-aware and an appearance-aware training strategy to optimize different layers of the Siamese network. By introducing diverse training patterns, the two required types of features can be captured simultaneously. Then, following the rule of feature distribution, an effective feature selection module is constructed by combining channel and spatial attention networks to adapt to rapid appearance changes of the object. Extensive experiments on several recent benchmarks demonstrate the effectiveness of our method, which significantly outperforms state-of-the-art trackers. |
---|---|
ISSN: | 0031-3203 1873-5142 |
DOI: | 10.1016/j.patcog.2021.108502 |
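
The summary above mentions a feature selection module that combines channel and spatial attention. As a rough illustration of that general idea only, the sketch below gates a small feature map with one weight per channel and one weight per spatial location. It is not the Ta-ASiam module: the record gives no architectural details, so the parameter-free sigmoid gates and the function names `channel_attention`, `spatial_attention` and `select_features` are all assumptions made for this example (real attention networks learn their gating weights).

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features):
    """Return one gate per channel, computed from global average pooling.

    `features` is a nested list shaped [channels][height][width].
    Assumption: no learned layers, the pooled mean is gated directly.
    """
    gates = []
    for channel in features:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        gates.append(sigmoid(total / count))
    return gates


def spatial_attention(features):
    """Return one gate per location, computed from the cross-channel mean."""
    num_channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    return [
        [
            sigmoid(sum(features[c][y][x] for c in range(num_channels)) / num_channels)
            for x in range(width)
        ]
        for y in range(height)
    ]


def select_features(features):
    """Reweight every activation by its channel gate and its spatial gate."""
    c_gates = channel_attention(features)
    s_gates = spatial_attention(features)
    return [
        [
            [value * c_gates[c] * s_gates[y][x] for x, value in enumerate(row)]
            for y, row in enumerate(channel)
        ]
        for c, channel in enumerate(features)
    ]


# Toy input: channel 0 is silent, channel 1 responds uniformly.
toy = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
selected = select_features(toy)
```

In this toy input the silent channel stays at zero while the active channel is scaled by both gates, which is the basic sense in which attention suppresses uninformative features and keeps informative ones.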