Audio compression with multi-algorithm fusion and its impact in speech emotion recognition

Bibliographic Details
Published in: International Journal of Speech Technology, 2020-06, Vol. 23 (2), p. 277-285
Main Authors: Reddy, A. Pramod; Vijayarajan, V.
Format: Article
Language:English
Summary: The study examines the impact of multi-algorithm fusion on audio compression with reference to traditional approaches. For emotion recognition, the most prominent features, Mel-Frequency Cepstral Coefficients (MFCC) and Discrete Wavelet Transform (DWT) features, are extracted from speech samples of the Berlin emotional database and a Telugu (a south Indian language) database, and an automatic emotion recognition system (AERS) based on multi-algorithm fusion is proposed. The AERS is meant to monitor and identify a subject's psychological/emotional state. The extracted features are classified into different emotional states using support vector machine (SVM) and K-nearest neighbour (K-NN) algorithms. Two state-of-the-art codecs, MP3 and Speex, are investigated at different bit rates to verify that specific emotions remain intelligible after compression. An MP3 codec configuration with a 96 kbps bit rate is recommended to achieve high compression for all emotions. The fusion algorithms also performed well compared with the individual algorithms: fusing DWT and MFCC yields an accuracy of 94.2%, compared to 89.1% using DWT and 91.38% using MFCC separately. The accuracy of the proposed method is further increased to 94% through a multiresolution approach that approximates frequency along with time information.
ISSN: 1381-2416
eISSN: 1572-8110
DOI: 10.1007/s10772-020-09689-9
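For illustration, below is a minimal Python sketch of the compression-then-recognition pipeline the summary describes, assuming librosa, PyWavelets, scikit-learn, and an ffmpeg binary on PATH. All file paths, the db4 wavelet, the decomposition depth, and the per-sub-band statistics are hypothetical stand-ins chosen for the sketch, not the authors' implementation; feature-level fusion is rendered here as simple concatenation of the MFCC and DWT descriptors.

```python
import subprocess
import numpy as np
import librosa
import pywt
from sklearn.svm import SVC

def compress_mp3(src, dst, kbps=96):
    """Re-encode an utterance as MP3 at a target bit rate.

    Assumes the ffmpeg CLI is on PATH; 96 kbps is the configuration the
    paper recommends for high compression across all emotions.
    """
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-b:a", f"{kbps}k", dst], check=True
    )

def extract_features(path, sr=16000, n_mfcc=13, wavelet="db4", level=4):
    """Return a fused MFCC + DWT feature vector for one utterance."""
    y, _ = librosa.load(path, sr=sr)
    # MFCC: average over frames for a fixed-length spectral summary.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)
    # DWT: multiresolution decomposition; summarize each sub-band by
    # its energy and standard deviation (hypothetical statistics).
    coeffs = pywt.wavedec(y, wavelet, level=level)
    dwt = np.array([s for c in coeffs for s in (np.sum(c**2), np.std(c))])
    # Feature-level fusion: concatenate both descriptors.
    return np.concatenate([mfcc, dwt])

# Hypothetical corpus of (wav path, emotion label) pairs, e.g. utterances
# from the Berlin emotional database.
corpus = [
    ("wav/anger_01.wav", "anger"),
    ("wav/anger_02.wav", "anger"),
    ("wav/happy_01.wav", "happiness"),
    ("wav/happy_02.wav", "happiness"),
]

# Compress each utterance, then extract fused features from the coded audio.
X, labels = [], []
for src, label in corpus:
    mp3 = src.replace(".wav", ".mp3")
    compress_mp3(src, mp3)
    X.append(extract_features(mp3))
    labels.append(label)

# SVM classifier over the fused features; sklearn's KNeighborsClassifier
# would be a drop-in alternative matching the paper's K-NN baseline.
clf = SVC(kernel="rbf").fit(np.stack(X), labels)
print(clf.predict([extract_features("wav/unseen_utterance.wav")]))
```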