Deep learning-based few-shot person re-identification from top-view RGB and depth images
Published in: | Neural computing & applications 2024-11, Vol.36 (31), p.19365-19382 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Person re-identification (re-id) attempts to match a person across images captured at different time steps. Existing deep learning approaches use either appearance or geometry features for re-id, which does not provide the required robustness because of high intra-class similarity. Existing supervised re-id approaches train Convolutional Neural Networks (CNNs) on identity-labeled images taken by sensors from a horizontal view. The horizontal view compromises people's privacy because their faces appear in the image. Moreover, person re-id must handle new, unseen people; however, a CNN cannot identify them because it lacks continual learning. Privacy-preserving, computer vision-assisted person re-id systems can benefit from visual appearance and geometry features extracted from top-view RGB and depth input. This paper presents a privacy-preserving top-view few-shot person re-id network that uses both appearance and geometry features. EfficientNet extracts appearance features from the RGB input, while PointNet extracts geometry features from the point cloud built through RGB-D image registration. The concatenated EfficientNet and PointNet features are fed to a two-layer Bi-LSTM network for person identification. Finally, the whole network is converted into a few-shot network that achieves continual learning by removing the output layer and attaching a similarity measurement unit. The approach is CNN-based and is fine-tuned on TVPR/2, a publicly available dataset acquired with a top-view arrangement. Experimental results on the TVPR/2 and GODPR datasets show that the proposed re-id network outperforms other state-of-the-art networks. |
---|---|
ISSN: | 0941-0643 1433-3058 |
DOI: | 10.1007/s00521-024-10239-6 |
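
The summary's final step, replacing the output layer with a similarity measurement unit, can be illustrated with a minimal sketch. It is not the authors' implementation. The record does not say which similarity measure is used, so cosine similarity is an assumption here. The function names, the `gallery` structure, and the toy feature vectors are all hypothetical; they stand in for features that the paper's network would produce.

```python
import math


def concat_features(appearance, geometry):
    """Join an appearance vector and a geometry vector into one embedding."""
    return list(appearance) + list(geometry)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(query, gallery):
    """Return (identity, score) of the most similar enrolled embedding."""
    best_id, best_score = None, -1.0
    for identity, embeddings in gallery.items():
        for emb in embeddings:
            score = cosine_similarity(query, emb)
            if score > best_score:
                best_id, best_score = identity, score
    return best_id, best_score


# Toy gallery: one enrolled embedding per person (hypothetical values).
gallery = {
    "person_a": [concat_features([1.0, 0.0], [0.5, 0.1])],
    "person_b": [concat_features([0.0, 1.0], [0.1, 0.9])],
}

# Enrolling an unseen person only adds an embedding; nothing is retrained.
gallery["person_c"] = [concat_features([0.7, 0.7], [0.9, 0.0])]

query = concat_features([0.9, 0.1], [0.4, 0.2])
match, score = identify(query, gallery)
print(match, round(score, 3))
```

Because identification reduces to a nearest-embedding lookup, adding a new identity does not require retraining a classification layer, which is the property the summary refers to as continual learning.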