Amazigh CNN speech recognition system based on Mel spectrogram feature extraction method
Published in: | International journal of speech technology 2024-03, Vol.27 (1), p.287-296 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Speech recognition simplifies spoken interaction between humans and machines. Number-oriented communication, such as stating a registration code, mobile number, score, or account number, can benefit from spoken-digit recognition. This paper presents our experience with Amazigh automatic speech recognition (ASR) based on a deep learning approach. As part of that approach, a convolutional neural network (CNN) and Mel spectrograms are used: the audio samples are converted to spectrograms and evaluated by the network. To recognize the Amazigh numerals, we use a database of the digits zero to nine collected from 42 native speakers, men and women between the ages of 20 and 40. Our experimental results show that spoken Amazigh digits can be identified with a maximum accuracy of 93.62%, a precision of 94%, and a recall of 94%. |
---|---|
ISSN: | 1381-2416 1572-8110 |
DOI: | 10.1007/s10772-024-10100-0 |
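
The summary names Mel spectrogram feature extraction but does not give its parameters. As a rough illustration only, the sketch below shows how a power spectrum can be pooled into Mel bands with triangular filters. It assumes the common HTK-style Mel formula, and the function names (`hz_to_mel`, `mel_to_hz`, `mel_filterbank`, `apply_filterbank`), the filter count, FFT size, and sample rate are all hypothetical choices for this example, not values taken from the article.

```python
import math


def hz_to_mel(f_hz):
    """HTK-style Mel scale (an assumption): m = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 points equally spaced on the Mel scale: the outer band
    # edges plus one centre frequency per filter.
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)    # slope up to the centre
            falling = (hi - f) / (hi - mid)   # slope down from the centre
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def apply_filterbank(power_spectrum, bank):
    """Pool one frame's power spectrum into Mel-band energies."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]


# Illustrative values only: 8 Mel bands, 256-point FFT, 16 kHz audio.
bank = mel_filterbank(n_mels=8, n_fft=256, sample_rate=16000)
flat_frame = [1.0] * (256 // 2 + 1)
mel_energies = apply_filterbank(flat_frame, bank)
```

Stacking such Mel-band vectors over successive frames, usually after a log transform, gives the two-dimensional spectrogram that a CNN can treat like an image. In practice a library routine would typically replace this hand-written filter bank.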