Evaluating MFCC-based speaker identification systems with data envelopment analysis

Bibliographic Details
Published in:	Expert systems with applications 2021-04, Vol.168, p.114448, Article 114448
Main Authors:	Özcan, Zübeyir; Kayıkçıoğlu, Temel
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Classifiers; Data analysis; Data envelopment analysis; Decision making; Efficiency; Energy consumption; Energy storage; MFCC features; Multi-criteria decision making; Multiple criterion; Parameters; Speaker identification; Speaker recognition evaluation; Speech recognition; Testing time; Training

Description
Summary:	The concept of efficiency in speaker recognition systems varies across the literature. Although many authors have defined efficiency as recognition accuracy, others have defined it as low energy consumption, memory storage, or computational burden. In this study, speaker recognition was instead evaluated with a two-stage multi-criteria decision-making approach. First, speaker identification based on Mel-frequency cepstral coefficients (MFCCs) was conducted for various parameters and methods, including the number of speakers, number of MFCCs, test speech duration, training utterance length, and choice of classifier. Classification metrics, memory storage, testing time, and training time were measured for each trial, and the performance of the trials was examined for each criterion. Consistent with the literature, the study revealed that no parameter or method achieved the best performance on all criteria. In the second stage, a multi-criteria efficiency analysis, as suggested in the literature, was conducted for various application scenarios. Data envelopment analysis was used to determine the efficiency of the trials under each scenario. Ranking the efficiency scores revealed that the best solution was task-dependent. Among the classifiers, artificial neural networks outperformed the others in terms of benefit relative to cost; however, some of their costs were high, whereas the other classifiers provided the best solutions with respect to the cost criteria. Finally, the number of MFCCs was the parameter with the least effect on efficiency. Altogether, the findings indicate that the efficiency of a speaker identification system cannot be defined as recognition accuracy, memory storage, testing time, or training time alone, but as a function of those criteria.
• The performance of ASI systems was evaluated according to various criteria.
• The efficiencies of the systems were measured in a task-dependent fashion.
• Efficiency was represented as a function of predefined criteria.
• The study shows that no method outperforms the rest for all criteria.
• The most accurate system may not be the most efficient one, and vice versa.
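
The idea that efficiency is a function of several criteria, and that the best system depends on the task, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the simplest data envelopment analysis setting, a single input (cost) and a single output (benefit) under constant returns to scale, where the efficiency score reduces to each trial's output/input ratio divided by the best ratio. The trial names, the numbers, and the helper names `ccr_efficiency` and `rank` are hypothetical and chosen only for illustration; the multi-criteria analysis described in the summary would require solving a linear program per trial.

```python
# Hypothetical trials: every name and number here is made up for illustration
# and does not come from the article.
trials = {
    "system_a": {"accuracy": 0.95, "test_time_s": 2.0, "memory_mb": 40.0},
    "system_b": {"accuracy": 0.90, "test_time_s": 0.5, "memory_mb": 10.0},
    "system_c": {"accuracy": 0.80, "test_time_s": 1.0, "memory_mb": 5.0},
}


def ccr_efficiency(trials, output_key, input_key):
    """Single-input, single-output efficiency under constant returns to scale.

    In this special case the DEA linear program reduces to each unit's
    output/input ratio divided by the best ratio among all units, so the
    best unit scores 1.0 and every other unit scores below 1.0.
    """
    ratios = {name: t[output_key] / t[input_key] for name, t in trials.items()}
    best = max(ratios.values())
    return {name: ratio / best for name, ratio in ratios.items()}


def rank(scores):
    """Return trial names ordered from most to least efficient."""
    return sorted(scores, key=scores.get, reverse=True)


# Scenario 1 (assumed): testing time is the cost that matters.
time_scores = ccr_efficiency(trials, "accuracy", "test_time_s")

# Scenario 2 (assumed): memory storage is the cost that matters.
memory_scores = ccr_efficiency(trials, "accuracy", "memory_mb")

# The most accurate trial, ignoring all costs.
most_accurate = max(trials, key=lambda name: trials[name]["accuracy"])

print("most accurate:", most_accurate)
print("ranking by accuracy per testing time:", rank(time_scores))
print("ranking by accuracy per memory:", rank(memory_scores))
```

With these made-up numbers, the most accurate trial is ranked last in both cost-aware scenarios, and the top-ranked trial changes when the cost criterion changes, which mirrors the task-dependence described in the summary.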
ISSN:	0957-4174 1873-6793
DOI:	10.1016/j.eswa.2020.114448