
The NGT200 Dataset: Geometric Multi-View Isolated Sign Recognition


Bibliographic Details
Published in: arXiv.org 2024-09
Main Authors: Ranum, Oline; Wessels, David R; Otterspeer, Gomer; Bekkers, Erik J; Roelofsen, Floris; Andersen, Jari I
Format: Article
Language: English
Description
Summary: Sign Language Processing (SLP) provides a foundation for a more inclusive future in language technology; however, the field faces several significant challenges that must be addressed to achieve practical, real-world applications. This work addresses multi-view isolated sign recognition (MV-ISR), and highlights the essential role of 3D awareness and geometry in SLP systems. We introduce the NGT200 dataset, a novel spatio-temporal multi-view benchmark, establishing MV-ISR as distinct from single-view ISR (SV-ISR). We demonstrate the benefits of synthetic data and propose conditioning sign representations on spatial symmetries inherent in sign language. Leveraging an SE(2) equivariant model improves MV-ISR performance by 8%-22% over the baseline.
ISSN:2331-8422