Generalized Intra-Camera Supervised Person Re-Identification
Published in: IEEE Transactions on Circuits and Systems for Video Technology, June 2024, Vol. 34 (6), pp. 4516-4527
Main Authors:
Format: Article
Language: English
Summary: Person re-identification (Re-ID) aims to match images of the same person across different camera views, which demands a view-invariant feature embedding. Recently, intra-camera supervised (ICS) Re-ID has sought to develop Re-ID models without cross-view annotated data. Existing ICS methods rely on assumptions, such as every person in the training set appearing under multiple cameras. However, without cross-view annotations there is no guarantee that these assumptions hold, and performance degrades when they are violated. In this work, we generalize ICS Re-ID and develop an ICS Re-ID model free of such assumptions. The absence of prior assumptions and cross-view annotations makes it challenging to exploit the discriminative information among cross-view images. To this end, we propose to mine view-invariant relations between cross-view images, allowing the Re-ID model to exploit discriminative information and overcome cross-view variations. Specifically, a feature composition module learns composited view-aware features by combining identity information with different camera-view information. We then exploit the composited features to model various view-aware relations between pairwise images, and by mining the common patterns among these view-aware relations, we obtain a view-invariant pairwise relation for learning. In addition, leveraging the composited view-aware features, we develop a view-aware marginal constraint for robust cross-view learning. To facilitate learning the feature composition module, we augment an auxiliary network that exploits camera-view information at the feature level. Extensive experimental results show the effectiveness of our method under different scenarios. (An illustrative sketch of the feature-composition idea follows the record below.)
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2023.3340346
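
The summary above describes a feature composition module that combines identity features with learned camera-view information, and a view-invariant pairwise relation obtained by mining common patterns across per-view relations. The paper's actual architecture is not reproduced in this record, so the following is a minimal, non-authoritative PyTorch sketch of that idea: the `FeatureComposition` module, its MLP composition operator, the learned per-camera embeddings, and the mean-pooling across views used to stand in for "mining common patterns" are all assumptions made for illustration, not the authors' implementation.

```python
# Hedged sketch only: illustrates the abstract's feature-composition and
# view-invariant-relation ideas. Module names, shapes, the MLP composition
# operator, and mean-pooling across views are assumptions, not the paper's code.

import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureComposition(nn.Module):
    """Composites an identity feature with learned camera-view embeddings."""

    def __init__(self, feat_dim: int, num_cameras: int):
        super().__init__()
        # One learnable embedding per camera view (assumed representation).
        self.view_embed = nn.Embedding(num_cameras, feat_dim)
        # Assumed composition operator: a small MLP over the concatenation
        # of the identity feature and a camera-view embedding.
        self.compose = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, feat_dim),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, D) identity features -> (B, C, D) view-aware features,
        # one composited feature per camera view.
        b, d = feats.shape
        c = self.view_embed.num_embeddings
        views = self.view_embed.weight.unsqueeze(0).expand(b, c, d)
        ident = feats.unsqueeze(1).expand(b, c, d)
        return self.compose(torch.cat([ident, views], dim=-1))


def view_invariant_relation(view_feats: torch.Tensor) -> torch.Tensor:
    """Pairwise relations per view, pooled across views.

    view_feats: (B, C, D). Returns a (B, B) similarity matrix. Averaging
    over views is one plausible way to extract the common pattern shared
    by all view-aware relations; the paper may pool differently.
    """
    v = F.normalize(view_feats, dim=-1)
    # (C, B, B): cosine similarity between all image pairs, per camera view.
    rel = torch.einsum("icd,jcd->cij", v, v)
    return rel.mean(dim=0)


if __name__ == "__main__":
    comp = FeatureComposition(feat_dim=256, num_cameras=6)
    feats = torch.randn(8, 256)      # stand-in for backbone identity features
    rel = view_invariant_relation(comp(feats))
    print(rel.shape)                 # torch.Size([8, 8])
```

Under these assumptions, the (B, B) relation matrix could serve as a supervision target that is stable across camera views, which matches the abstract's motivation for overcoming cross-view variation without cross-view labels; the view-aware marginal constraint and auxiliary network mentioned in the summary are not sketched here.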