
Cross-lingual few-shot sign language recognition


Bibliographic Details
Published in: Pattern recognition 2024-07, Vol.151, p.110374, Article 110374
Main Authors: Bilge, Yunus Can, Ikizler-Cinbis, Nazli, Cinbis, Ramazan Gokberk
Format: Article
Language: English
Description
Summary: There are over 150 sign languages worldwide, each with numerous local variants and thousands of signs. However, collecting annotated data for each sign language to train a model is a laborious and expert-dependent task. To address this issue, this paper introduces the problem of few-shot sign language recognition (FSSLR) in a cross-lingual setting. The central motivation is to be able to recognize a novel sign, even if it belongs to a sign language unseen during training, based on a small set of examples. To tackle this problem, we propose a novel embedding-based framework that first extracts a spatio-temporal visual representation based on video and hand features, as well as hand landmark estimates. To establish a comprehensive test bed, we propose three meta-learning FSSLR benchmarks that span multiple languages, and extensively evaluate the proposed framework. The experimental results demonstrate the effectiveness and superiority of the proposed approach for few-shot sign language recognition in both monolingual and cross-lingual settings.
• The motivation of the problem is to recognize a novel sign based on a small set of examples.
• A novel framework that leverages signer body and hand features for embedding is proposed.
• Three novel meta-learning benchmarks that span multiple languages are introduced.
• Our embedding framework achieves the best performance on the three proposed benchmarks.
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2024.110374