
Heterogeneous Generative Tokens and Distance-aware Recovery Network for Occluded Person Re-identification

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2024-12, p.1-1
Main Authors: Li, Zhihao; Zhang, Huaxiang; Zhu, Lei; Sun, Jiande; Liu, Li
Format: Article
Language: English
Description
Summary: In real-world surveillance scenarios, person re-identification (Re-ID) is often severely degraded by occlusion, so a model must not only extract powerful features but also recover the features of occluded regions. Although existing methods disentangle visible human bodies by clustering semantic information, they often damage discriminative appearance cues by introducing background noise. To solve this problem, we propose the Heterogeneous Generative Tokens and Distance-aware Recovery (HGTDR) network, which aims to extract discriminative appearance features and recover occluded body regions. HGTDR contains two main branches: a holistic stream and a part stream. The holistic stream uses a ViT to capture global context and provides stable global features by establishing long-range relationships. In the part stream, we propose a Semantic Patch Generator (SPG), which uses a local attention mechanism to capture rich local semantics and generate semantic patches. Then, based on the discrimination and relevance scores of the semantic patches, we feed them into the proposed Adaptive Heterogeneous Semantic Token Generator (AHSTG) to gradually generate strong-response foreground and weak-response background features. In addition, to complete the features of occluded regions, we design the Distance-based Feature Recovery (DFR) module, which calculates the planar Euclidean distance between heterogeneous tokens and adaptively assigns corresponding weights to dynamically recover the invisible body parts. Finally, we obtain discriminative and robust person descriptors. Extensive experiments on several challenging occluded, partial, and holistic Re-ID datasets demonstrate that the proposed HGTDR network outperforms various state-of-the-art methods.
ISSN: 1051-8215
DOI: 10.1109/TCSVT.2024.3519312
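
The summary describes the DFR module only at a high level: planar Euclidean distances between tokens are turned into weights that are used to recover occluded features. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes that each token has a (row, col) position on the patch grid, that the weights are a softmax over negative distances, and that a recovered feature is the weighted sum of visible token features. The names planar_distance, recover_feature, and temperature are hypothetical and do not come from the paper.

```python
import math


def planar_distance(p, q):
    """Euclidean distance between two (row, col) positions on the patch grid."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def recover_feature(target_pos, visible_pos, visible_feats, temperature=1.0):
    """Estimate the feature of an occluded token from visible tokens.

    Assumption: weights are a softmax over negative planar distances, so
    nearer visible tokens contribute more. The summary does not state the
    actual weighting function used by the DFR module.
    """
    if not visible_pos:
        raise ValueError("at least one visible token is required")
    if len(visible_pos) != len(visible_feats):
        raise ValueError("positions and features must have the same length")

    dists = [planar_distance(target_pos, p) for p in visible_pos]

    # Shift by the smallest distance so the largest exponent is exactly 0.
    d_min = min(dists)
    scores = [math.exp(-(d - d_min) / temperature) for d in dists]
    total = sum(scores)
    weights = [s / total for s in scores]

    # Weighted sum of the visible feature vectors, one dimension at a time.
    dim = len(visible_feats[0])
    recovered = [
        sum(w * feat[k] for w, feat in zip(weights, visible_feats))
        for k in range(dim)
    ]
    return recovered, weights


if __name__ == "__main__":
    # Toy example: one occluded token at (0, 0) and two visible neighbours.
    feature, weights = recover_feature(
        target_pos=(0, 0),
        visible_pos=[(0, 1), (0, 3)],
        visible_feats=[[2.0, 0.0], [4.0, 1.0]],
    )
    print("weights:", weights)
    print("recovered feature:", feature)
```

In this sketch a smaller temperature concentrates the weight on the nearest visible token, while a larger one approaches a plain average. Either behaviour is a design choice of the example and is not taken from the paper.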