Automatic tuning of radio stations based on listener’s preference using Software Defined Radio and MATLAB
Published in: | Engineering applications of artificial intelligence 2024-11, Vol.137, p.109117, Article 109117 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | This work introduces a real-time system that automates the selection of radio stations according to the listener’s preference (speech or music) by analyzing the incoming audio signals with a machine-learning speech/music classifier (SMC). Radio-frequency data from different frequency-modulated (FM) stations are read directly into MATLAB using the Communications Toolbox Support Package for RTL-SDR Radio, where RTL-SDR denotes a low-cost software-defined radio receiver. The work is divided into two phases. Initially, the efficiency of different state-of-the-art features for designing a speech/music classifier is studied on new speech/music corpora developed for Indian radio stations. Box plots and Receiver Operating Characteristic (ROC) plots were used to study feature importance. The findings from these experiments were used to select the optimum feature sets for efficient operation of the real-time model. The models were tested on both offline and online test data. The best-performing feature vectors (Mel-Frequency Cepstral Coefficients (MFCC) + Variance Spectral Roll-off) were then concatenated, giving the best classification accuracy of 93.06% and 83.91% for offline and real-time data, respectively, using a Gaussian Mixture Model (GMM) classifier. We also studied the efficiency of recently proposed Empirical Mode Decomposition (EMD)-based statistical and Hilbert Spectrum-based features on both standard datasets (Slaney, GTZAN, MUSAN) and newly created datasets. We achieved overall accuracies of 94.91%, 93.20%, 91.72%, and 95.40% on the SS, GTZAN, MUSAN, and BITM datasets, respectively. However, concatenating several features increased the latency of the algorithm. This work also introduces a new speech/music corpus based on recordings from Indian radio stations in the Hindi language.
• Novel real-time application of SMC is proposed for auto-tuning of radio stations using RTL-SDR and machine learning.
• A new Hindi speech-music corpus is proposed to promote research in the Indian context.
• Experiments were conducted on 16 different temporal and spectral features.
• Best mean classification accuracy of 93.06% was attained. |
---|---|
ISSN: | 0952-1976 |
DOI: | 10.1016/j.engappai.2024.109117 |
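
The summary names the variance of spectral roll-off as one of the best-performing features. The sketch below illustrates that feature only; it is not the authors' implementation. It is written in Python with NumPy rather than MATLAB, and the 85% roll-off fraction, the frame and hop lengths, the magnitude-based definition, and the synthetic test signals are all assumptions chosen for illustration.

```python
import numpy as np

def spectral_rolloff(frame, sample_rate, fraction=0.85):
    """Frequency (Hz) below which `fraction` of the frame's spectral magnitude lies.

    The 0.85 default is an assumed, commonly used value, not one taken from the paper.
    """
    magnitude = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    cumulative = np.cumsum(magnitude)
    # First bin where the running sum reaches the target share of the total.
    bin_index = np.searchsorted(cumulative, fraction * cumulative[-1])
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return freqs[min(bin_index, len(freqs) - 1)]

def rolloff_variance(signal, sample_rate, frame_len=1024, hop=512):
    """Variance of the per-frame roll-off across one analysis segment."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    rolloffs = [spectral_rolloff(signal[s:s + frame_len], sample_rate) for s in starts]
    return float(np.var(rolloffs))

# Synthetic stand-ins (hypothetical, not radio data): a steady tone versus a
# signal whose spectral content switches every 0.1 s.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate  # one second of samples
steady = np.sin(2 * np.pi * 440 * t)
switching = np.where(
    (t * 10).astype(int) % 2 == 0,
    np.sin(2 * np.pi * 300 * t),
    np.sin(2 * np.pi * 3000 * t),
)

steady_var = rolloff_variance(steady, sample_rate)
switching_var = rolloff_variance(switching, sample_rate)
print(f"steady: {steady_var:.1f}  switching: {switching_var:.1f}")
```

A signal whose spectrum changes from frame to frame yields a larger roll-off variance than a spectrally stable one, which is the kind of contrast a speech/music classifier can exploit. In a pipeline like the one the summary describes, this scalar would be concatenated with the MFCC vector before being passed to the classifier.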