Evaluating MFCC-based speaker identification systems with data envelopment analysis
Published in: | Expert systems with applications 2021-04, Vol.168, p.114448, Article 114448 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | The concept of efficiency in speaker recognition systems varies in the literature. Although many authors define efficiency as recognition accuracy, others define it as low energy consumption, memory storage, or computational burden. In this study, speaker recognition was evaluated with a novel two-stage multi-criteria decision-making approach. First, speaker identification based on Mel-frequency cepstrum coefficients (MFCC) was conducted for various parameters and methods, including the number of speakers, number of MFCCs, test speech duration, training utterance length, and classifier. Classification metrics, memory storage, testing time, and training time were measured for each trial, and the performance of the trials was examined for each criterion. Consistent with the literature, the study revealed that no parameter or method achieved the best performance for all criteria. In the second stage, a multi-criteria efficiency analysis, as suggested in the literature, was conducted according to various application scenarios. Data envelopment analysis was used to determine the efficiency of each trial under each scenario. Ranking the efficiency scores revealed that the best solution was task-dependent. Among the classifiers, artificial neural networks outperformed the others when benefits were weighed against costs; however, some of their costs were high, whereas the other classifiers provided the best solutions in light of cost criteria. Last, the number of MFCCs was the least effective parameter for efficiency. Altogether, the findings indicate that the efficiency of a speaker identification system cannot be defined as recognition accuracy, memory storage, testing time, or training time alone, but as a function of those criteria.
• The performance of ASI systems was evaluated according to various criteria. • The efficiencies of the systems were measured in a task-dependent fashion. • Efficiency was represented as a function of predefined criteria. • The study shows that no method outperforms the rest for all criteria. • The most accurate system may not be the most efficient one, and vice versa. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2020.114448 |