
Non-negative matrix factorization-based time-frequency feature extraction of voice signal for Parkinson's disease prediction


Bibliographic Details
Published in: Computer Speech & Language, 2021-09, Vol. 69, Article 101216
Main Authors: Karan, Biswajit; Sahu, Sitanshu Sekhar; Orozco-Arroyave, Juan Rafael; Mahto, Kartik
Format: Article
Language: English
Description
Summary:
• A time-frequency representation of the voice signal, decomposed with non-negative matrix factorization, is proposed to extract relevant features for Parkinson's disease (PD) detection.
• Average classification accuracies of up to 92% in vowels and 97% in words are achieved.
• In the dysarthria level evaluation, Spearman's correlations of around 0.81 are achieved in sustained vowels and in isolated words.
• The results indicate that the proposed approach is suitable and robust for the automatic detection of PD.

Parkinson's disease (PD) is a neurological disorder that affects people in old age. The majority of people with PD develop voice impairments, mainly related to what is known as dysarthric speech. Voice analysis can help in PD detection and in evaluating the dysarthria level of patients. This study introduces time-frequency features to model the discontinuities and abrupt changes that PD causes in the voice signal. The proposed method consists of four stages: time-frequency matrix (TFM) representation, TFM decomposition using non-negative matrix factorization (NMF), feature extraction, and classification. Statistical analyses show that the proposed time-frequency features differ significantly between PD patients and healthy speakers. Experiments are conducted with sustained vowel phonations and isolated words from the PC-GITA corpus. The proposed method achieves average classification accuracies of up to 92% in vowels and 97% in words, an improvement in accuracy of 10% to 40% over existing methods. The developed models are further evaluated on an independent dataset. On this separate test set, accuracies range from 63% to 75% in vowels and from 53% to 75% in isolated words. In the dysarthria level evaluation, Spearman's correlations between original and predicted labels are around 0.81 in both sustained vowels and isolated words. The results indicate that the proposed approach is suitable and robust for the automatic detection of PD.
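
The first three stages named in the summary (TFM representation, NMF decomposition, feature extraction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not specify the time-frequency transform, the NMF rank, the feature set, or the classifier. The sketch assumes a Hann-windowed STFT magnitude as the TFM, Lee-Seung multiplicative updates for NMF, and simple mean and standard-deviation statistics as features. The function names, the frame and hop sizes, the rank, and the synthetic test signal are all hypothetical.

```python
import numpy as np


def time_frequency_matrix(signal, frame_len=256, hop=128):
    """Assumed TFM: STFT magnitude, shape (frequency bins, frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def nmf(V, rank=3, n_iter=200, eps=1e-9, seed=0):
    """Factor V ~ W @ H with Lee-Seung multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps  # spectral basis vectors
    H = rng.random((rank, V.shape[1])) + eps  # temporal activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def summary_features(W, H):
    """Hypothetical feature vector: mean and std of each basis and activation."""
    return np.concatenate(
        [W.mean(axis=0), W.std(axis=0), H.mean(axis=1), H.std(axis=1)]
    )


# Synthetic stand-in for a one-second sustained vowel (not real speech data).
fs = 8000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 150 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)

V = time_frequency_matrix(signal)
W, H = nmf(V)
features = summary_features(W, H)
```

In a full pipeline, one such feature vector per recording would be passed to the fourth stage, a classifier trained to separate PD patients from healthy speakers. That stage is omitted here because the record does not name the classifier used.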
ISSN: 0885-2308, 1095-8363
DOI: 10.1016/j.csl.2021.101216