Real-time hand gesture recognition using multiple deep learning architectures

Bibliographic Details
Published in:Signal, image and video processing, 2023-11, Vol.17 (8), p.3963-3971
Main Authors: Aggarwal, Apeksha, Bhutani, Nikhil, Kapur, Ritvik, Dhand, Geetika, Sheoran, Kavita
Format: Article
Language:English
Description
Summary:Human gesture recognition is one of the most challenging problems in computer vision, striving to analyze human gestures by machine. However, most of the literature on gesture recognition utilizes isolated data with only one gesture in one image or a video for classifying gestures. This work targets the identification of human gestures from the continuous stream of data input taken from a live camera feed, with no pre-defined boundaries. This task becomes even more complex given the diverse lighting conditions, varying backgrounds and different gesture positions in the same input stream of data. This work presents an effective deep learning architecture to classify gestures taken from multiple viewpoints and varying object sizes. To perform the classification, in this work, we have synthesized a real-world dataset consisting of 4500 images collected from different persons of varying age groups ranging from 10 to 50. The dataset is accumulated considering a wide variety of characteristics to address the complexities in the gesture recognition process. A real-time system is developed that captures, analyzes and classifies live gesture videos frame by frame. To prove the validity of our approach, we have compared our results with multiple deep learning architectures and other benchmark datasets. The results depict that our approach outperforms the existing works and is able to detect gestures with deteriorating lighting conditions and murky gesture positions, achieving an accuracy of 99.63%.
ISSN:1863-1703, 1863-1711
DOI:10.1007/s11760-023-02626-8