Speech/music classification using visual and spectral chromagram features

Bibliographic Details
Published in: Journal of Ambient Intelligence and Humanized Computing, 2020, Vol. 11(1), pp. 329-347
Main Authors: Birajdar, Gajanan K., Patil, Mukesh D.
Format: Article
Language:English
Summary: Automatic speech/music classification is an important tool in multimedia content analysis and retrieval that efficiently categorizes input audio and stores it into the relevant class. This article proposes the use of chromagram textural and spectral features for speech and music classification. The chromagram textural feature set is based on transforming the input audio into a chromagram image representation and then extracting uniform local binary pattern textural descriptors. The chroma spectral features comprise novel chroma bin features that exploit the tonality present in the music signal. The optimal feature subset is selected from the original feature set using eigenvector-centrality-based feature selection, removing redundant and irrelevant features and further improving prediction performance. The performance of the algorithm is evaluated on the S&S, GTZAN and MUSAN databases, demonstrating the suitability of both chroma spectral and visual features for the classification task. Extensive experiments performed using a support vector machine classifier show that the chromagram textural descriptors outperform other state-of-the-art approaches. Good results are also achieved under mismatched training and testing conditions.
ISSN: 1868-5137, 1868-5145
DOI: 10.1007/s12652-019-01303-4
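
Illustrative sketch of the pipeline described in the summary (chromagram rendered as an image, uniform local binary pattern histogram, SVM classifier). This is a minimal example under assumptions, not the authors' implementation: the librosa/scikit-image/scikit-learn calls, the parameter choices (P = 8, R = 1, RBF kernel) and the toy signals are illustrative only.

    import numpy as np
    import librosa
    from skimage.feature import local_binary_pattern
    from sklearn.svm import SVC

    def chroma_lbp_features(y, sr, n_points=8, radius=1):
        # Chromagram (12 x frames, values in [0, 1]) rendered as an 8-bit grayscale image.
        chroma = librosa.feature.chroma_stft(y=y, sr=sr)
        img = np.uint8(255 * chroma)
        # Uniform LBP codes; with P = 8 there are P + 2 = 10 distinct codes to histogram.
        lbp = local_binary_pattern(img, n_points, radius, method="uniform")
        hist, _ = np.histogram(lbp, bins=np.arange(n_points + 3), density=True)
        return hist

    # Toy usage: a harmonic tone stands in for music, white noise for speech-like audio.
    sr = 22050
    t = np.linspace(0.0, 2.0, 2 * sr, endpoint=False)
    music_like = 0.5 * np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 660 * t)
    speech_like = 0.1 * np.random.randn(len(t))

    X = np.vstack([chroma_lbp_features(music_like, sr), chroma_lbp_features(speech_like, sr)])
    labels = np.array([1, 0])  # 1 = music, 0 = speech (toy labels)

    clf = SVC(kernel="rbf").fit(X, labels)  # SVM classifier, as used in the article
    print(clf.predict(X))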