Speech recognition


Bibliographic Details
Published in: IEEE Potentials, 1998-02, Vol. 17 (1), p. 23-28
Main Authors: Alotaibi, Y.A., Shahsavari, M.M.
Format: Article
Language: English
Description
Summary: The authors designed, trained, and tested an Arabic speech recognition system, implemented in C++ on Windows 95. The system is partitioned into five main modules: the front-end, feature extraction, training, pattern recognition, and decision making and display. The front-end module prepares and calibrates the signal. This includes setting the sampling rate, removing the DC component, setting the scaling factor, and detecting the endpoints of the utterance. Endpoint detection removes the non-speech portions of the signal created by the speaker's pauses, which reduces both computation time and memory requirements. The feature extraction module is mainly a digital signal processor. The training module finds the best templates for every word or sound (phoneme) in the system's database; it needs to be executed only once, before users can use the system. The pattern recognition module compares the given utterance (the test utterance) to all the stored templates (the reference module). The decision and display module serves as the interface between the user and the hidden system modules: after receiving the results of the recognition module, it displays the best candidate(s) and/or their likelihood percentages. The error rates are also computed and displayed in this module.
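The front-end steps named in the summary (DC removal, scaling, endpoint detection) can be illustrated with a minimal C++ sketch. This is not the authors' implementation; the record gives no algorithmic details. The function names removeDc, scaleToPeak, and findEndpoints, the peak-based scaling, and the fixed-threshold short-time energy test are all assumptions made for illustration. The sketch also trims only leading and trailing silence, not pauses inside the utterance.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: subtract the mean so the signal is centred on zero.
std::vector<double> removeDc(const std::vector<double>& x) {
    if (x.empty()) return {};
    const double mean =
        std::accumulate(x.begin(), x.end(), 0.0) / static_cast<double>(x.size());
    std::vector<double> y(x.size());
    std::transform(x.begin(), x.end(), y.begin(),
                   [mean](double v) { return v - mean; });
    return y;
}

// Hypothetical helper: scale so the largest absolute sample equals `peak`.
// (Assumed meaning of "scaling factor"; the source does not define it.)
std::vector<double> scaleToPeak(const std::vector<double>& x, double peak = 1.0) {
    double maxAbs = 0.0;
    for (double v : x) maxAbs = std::max(maxAbs, std::fabs(v));
    if (maxAbs == 0.0) return x;  // all-zero signal: nothing to scale
    std::vector<double> y(x);
    for (double& v : y) v *= peak / maxAbs;
    return y;
}

// Hypothetical helper: return the [begin, end) sample range from the first to
// the last non-overlapping frame whose mean energy exceeds `threshold`.
// Returns {0, 0} when no frame qualifies.
std::pair<std::size_t, std::size_t> findEndpoints(const std::vector<double>& x,
                                                  std::size_t frameLen,
                                                  double threshold) {
    if (frameLen == 0) return {0, 0};
    bool found = false;
    std::size_t first = 0, last = 0;
    for (std::size_t start = 0; start + frameLen <= x.size(); start += frameLen) {
        double energy = 0.0;
        for (std::size_t i = start; i < start + frameLen; ++i) energy += x[i] * x[i];
        energy /= static_cast<double>(frameLen);
        if (energy > threshold) {
            if (!found) { first = start; found = true; }
            last = start + frameLen;
        }
    }
    if (!found) return {0, 0};
    return {first, last};
}
```

In use, the three helpers would be chained: remove the DC offset, scale the result, then keep only the samples inside the range returned by findEndpoints before passing them on to feature extraction. A practical system would likely derive the energy threshold from the measured background noise rather than fix it in advance.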
ISSN: 0278-6648, 1558-1772
DOI: 10.1109/45.652853