Real-time word gesture detection and performance analysis using RCNN and RNN algorithms along with speech generation
Main Authors:
Format: Conference Proceeding
Language: English
Summary: This study compares the effectiveness of two techniques for real-time word gesture detection: the RCNN algorithm and the RNN with MediaPipe algorithm. It also examines the incorporation of speech generation to improve the user experience and enable seamless interaction between humans and machines. The goal is to evaluate the performance of the two techniques and assess their practical applicability in areas such as human-computer interaction and assistive technology. The experimental outcomes show the benefits and limits of each strategy, offering insight into their distinct strengths and shortcomings. The RNN design supports sequential learning and memory retention, making it well suited to modelling the sequential character of sign language. The RCNN method, on the other hand, detects and recognizes sign language gestures using a region-based convolutional neural network architecture. RCNN models can localize and classify objects within images, making them useful for recognizing hand shapes and motions in sign language videos or frames; the architecture gathers visual information from regions of interest, enabling precise and robust sign language identification. Both techniques have benefits and drawbacks. The RNN with MediaPipe offers strong temporal modelling, allowing reliable identification of sign language sequences and the capacity to handle variations in timing and motion. After a gesture is recognized, the paper adds a speech generation stage, intended to help improve accessible communication tools for people with disabilities; it uses the PYTT3 Python module to generate speech.
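The abstract describes a pipeline in which hand landmarks are tracked, a recurrent model classifies the gesture, and the recognized word is spoken aloud. The sketch below is a minimal, illustrative version of that RNN-with-MediaPipe branch, not the authors' implementation: it extracts 21 hand landmarks per frame with MediaPipe Hands, feeds fixed-length landmark sequences into a small Keras LSTM classifier, and voices the predicted word with pyttsx3 (assumed here to be the module the abstract calls "PYTT3"). The label list, sequence length, and layer sizes are placeholders.

```python
# Illustrative sketch only: MediaPipe hand landmarks -> LSTM gesture classifier -> speech.
# Labels, sequence length, and layer sizes are assumptions, not the paper's configuration.
import cv2
import numpy as np
import mediapipe as mp
import pyttsx3
import tensorflow as tf

GESTURE_LABELS = ["hello", "thanks", "yes", "no"]   # placeholder vocabulary
SEQ_LEN = 30                                        # frames per gesture sequence
N_FEATURES = 21 * 3                                 # 21 hand landmarks, (x, y, z) each


def build_model():
    """Small LSTM classifier over landmark sequences (illustrative sizes)."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(SEQ_LEN, N_FEATURES)),
        tf.keras.layers.LSTM(64, return_sequences=True),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(len(GESTURE_LABELS), activation="softmax"),
    ])


def landmarks_from_frame(hands, frame_bgr):
    """Return a flat (63,) landmark vector for the first detected hand, or zeros."""
    result = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        lm = result.multi_hand_landmarks[0].landmark
        return np.array([[p.x, p.y, p.z] for p in lm], dtype=np.float32).flatten()
    return np.zeros(N_FEATURES, dtype=np.float32)


def main():
    model = build_model()          # in practice, load trained weights instead
    engine = pyttsx3.init()        # offline text-to-speech engine
    cap = cv2.VideoCapture(0)
    window = []
    with mp.solutions.hands.Hands(max_num_hands=1) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            window.append(landmarks_from_frame(hands, frame))
            if len(window) == SEQ_LEN:
                probs = model.predict(np.expand_dims(np.array(window), 0), verbose=0)[0]
                word = GESTURE_LABELS[int(np.argmax(probs))]
                engine.say(word)   # speak the recognized word
                engine.runAndWait()
                window = []
            cv2.imshow("gesture", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
```

A trained model and the paper's actual gesture vocabulary would replace the placeholders above; the RCNN branch discussed in the abstract would instead run a region-based detector over each frame rather than classifying landmark sequences.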
ISSN: 0094-243X, 1551-7616
DOI: 10.1063/5.0226657