On Input/Output Architectures for Convolutional Neural Network-Based Cross-View Gait Recognition

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2019-09, Vol. 29 (9), p. 2708-2719
Main Authors: Takemura, Noriko; Makihara, Yasushi; Muramatsu, Daigo; Echigo, Tomio; Yagi, Yasushi
Format: Article
Language: English
Description
Summary: In this paper, we discuss input/output architectures for convolutional neural network (CNN)-based cross-view gait recognition. For this purpose, we consider two aspects: verification versus identification, and the tradeoff between spatial displacements caused by subject difference and those caused by view difference. More specifically, we use a Siamese network with a pair of inputs and contrastive loss for verification, and a triplet network with a triplet of inputs and triplet ranking loss for identification. These CNN architectures are insensitive to spatial displacement, because the difference between a matching pair is calculated at the last layer, after the inputs have passed through the convolution and max pooling layers; hence, they are expected to work relatively well under large view differences. By contrast, under small view differences it is better to exploit the spatial displacement caused by subject difference, so we also use CNN architectures in which the difference between a matching pair is calculated at the input level, making them more sensitive to spatial displacement. We conducted experiments on cross-view gait recognition and confirmed that the proposed architectures outperformed the state-of-the-art benchmarks in the situations each is suited to, in terms of verification/identification task and view difference.
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2017.2760835
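
The summary names two losses and contrasts two places where a pair can be differenced. The sketch below illustrates both ideas in plain Python. It is not the paper's implementation: the loss formulas are the commonly used textbook forms, the `margin` values are arbitrary, and the 1-D `max_pool` toy with its helper names (`last_layer_difference`, `input_level_difference`) is a hypothetical stand-in for a real CNN, chosen only to show why pooling before differencing hides small spatial shifts.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def contrastive_loss(f1, f2, same_subject, margin=1.0):
    """Common contrastive loss for a pair (assumed form, not taken from the paper).

    Same-subject pairs are pulled together; different-subject pairs are
    pushed apart until their distance is at least `margin`.
    """
    d2 = squared_distance(f1, f2)
    if same_subject:
        return 0.5 * d2
    return 0.5 * max(0.0, margin - math.sqrt(d2)) ** 2


def triplet_ranking_loss(anchor, positive, negative, margin=1.0):
    """Common triplet ranking loss (assumed form): the positive should be
    closer to the anchor than the negative by at least `margin`."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, margin + gap)


def max_pool(signal, width):
    """Non-overlapping 1-D max pooling; a toy stand-in for CNN pooling layers."""
    return [max(signal[i:i + width]) for i in range(0, len(signal), width)]


def last_layer_difference(x, y, width=4):
    """Pool each input first, then difference the pooled features."""
    return sum(abs(p - q) for p, q in zip(max_pool(x, width), max_pool(y, width)))


def input_level_difference(x, y, width=4):
    """Difference the raw inputs first, then pool the difference."""
    return sum(max_pool([abs(p - q) for p, q in zip(x, y)], width))


# Two toy "silhouette rows" that differ only by a one-sample shift.
a = [0, 1, 0, 0, 0, 0, 0, 0]
b = [0, 0, 1, 0, 0, 0, 0, 0]

print(last_layer_difference(a, b))   # pooling absorbs the shift
print(input_level_difference(a, b))  # the shift survives as a nonzero response
```

In this toy, differencing after pooling cannot see the one-sample shift, whereas differencing at the input preserves it. This mirrors the tradeoff described in the summary between robustness under large view differences and sensitivity under small ones.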