Generalized zero-shot action recognition through reservation-based gate and semantic-enhanced contrastive learning

Bibliographic Details
Published in: Knowledge-Based Systems, 2024-10, Vol. 301, p. 112283, Article 112283
Main Authors: Shang, Junyuan; Niu, Chang; Tao, Xiyuan; Zhou, Zhiheng; Yang, Junmei
Format: Article
Language: English
Description
Summary: Generalized zero-shot action recognition (GZSAR) aims to classify actions from both the classes seen during training and unseen classes for which no samples are available. Because all training samples come from seen classes, classifying directly in a combined space that covers both seen and unseen classes makes the predicted scores of the two groups compete, so unseen test samples can be misclassified as seen ones. In addition, existing generative methods rely solely on the provided class-level semantic features and do not explore the interrelations among them, which limits the quality of the generated features. In this paper, we tackle GZSAR with a novel method, reservation-based gate and semantic-enhanced contrastive learning (RGSCL). We introduce a reserved classifier and optimize it with constructed fictive samples to learn a reservation-based gate, which avoids this competition and alleviates the bias of classification scores towards seen classes. Further, we conduct contrastive learning on hypersphere-based enhanced semantic features so that the generated features maintain a consistent relationship with their corresponding semantic features, thereby improving the generator's understanding of the semantic interrelations. RGSCL is highly compatible with existing generative GZSAR methods. Extensive experiments on three datasets under both conventional zero-shot and generalized zero-shot settings demonstrate the effectiveness of the proposed RGSCL.
ISSN: 0950-7051
DOI: 10.1016/j.knosys.2024.112283
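
Illustrative sketch: The Summary describes two ideas, a gate that keeps seen and unseen scores from competing, and a contrastive objective that ties generated features to their semantic features. This record does not give the paper's architecture or losses, so the code below is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the gate is a (num_seen + 1)-way classifier whose last output is a hypothetical "reserved" slot for unseen samples, that visual and semantic features have already been projected to a common dimension, and that an InfoNCE-style loss on L2-normalized vectors stands in for the contrastive term. The names gated_predict and info_nce are made up for illustration.

```python
import numpy as np


def gated_predict(gate_logits, seen_logits, unseen_logits, num_seen):
    """Route each sample to a seen or unseen classifier (assumed design).

    gate_logits:   (n, num_seen + 1); the last column is the assumed reserved slot.
    seen_logits:   (n, num_seen) scores from a seen-class classifier.
    unseen_logits: (n, num_unseen) scores from an unseen-class classifier.
    Returns labels in [0, num_seen + num_unseen).
    """
    # A sample is treated as unseen when the reserved slot wins the gate.
    is_unseen = gate_logits.argmax(axis=1) == num_seen
    seen_pred = seen_logits.argmax(axis=1)
    # Unseen labels are offset so they do not collide with seen labels.
    unseen_pred = unseen_logits.argmax(axis=1) + num_seen
    # Seen and unseen scores are never compared against each other directly.
    return np.where(is_unseen, unseen_pred, seen_pred)


def info_nce(features, semantics, labels, temperature=0.1):
    """InfoNCE-style loss between features and class semantic vectors.

    features:  (n, d) generated visual features (assumed projected to d dims).
    semantics: (c, d) one semantic vector per class.
    labels:    (n,) class index of each feature.
    """
    # L2 normalization places both sets of vectors on the unit hypersphere.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    s = semantics / np.linalg.norm(semantics, axis=1, keepdims=True)
    logits = f @ s.T / temperature
    # Numerically stable log-softmax over classes.
    m = logits.max(axis=1, keepdims=True)
    log_prob = logits - (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True)))
    # Pull each feature towards its own class semantics, away from the others.
    return -log_prob[np.arange(len(labels)), labels].mean()


if __name__ == "__main__":
    gate = np.array([[2.0, 0.1, 0.3], [0.1, 0.2, 3.0]])
    seen = np.array([[1.0, 0.0], [0.0, 1.0]])
    unseen = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
    print(gated_predict(gate, seen, unseen, num_seen=2))
    print(info_nce(np.eye(3), np.eye(3), np.array([0, 1, 2])))
```

In this sketch the gate decides only which classifier to trust, so a bias towards seen classes inside the seen classifier cannot override an unseen prediction. How the reserved slot is trained with fictive samples, and how the enhanced semantic features are built, is described in the full paper rather than in this record.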