Non-negative matrix factorization-based time-frequency feature extraction of voice signal for Parkinson's disease prediction
Published in: | Computer speech & language 2021-09, Vol.69, p.101216, Article 101216 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: |
• Time-frequency representation of the voice signal with non-negative matrix factorization is proposed to extract relevant features for Parkinson's disease (PD) detection.
• Average classification accuracies of up to 92% in vowels and 97% in words are achieved.
• In the dysarthria level evaluation, Spearman's correlations of around 0.81 are achieved in sustained vowels and in isolated words.
• The results indicate that the proposed approach is suitable and robust for the automatic detection of PD.
Parkinson's disease (PD) is a neurological disorder that affects people in old age. The majority of people suffering from PD develop voice impairments, mainly related to what is known as dysarthric speech. Voice analysis can help in PD detection and in evaluating the dysarthria level of patients. This study introduces time-frequency features to model the discontinuities and abrupt changes that arise in the voice signal due to PD. The proposed method consists of four stages: time-frequency matrix (TFM) representation, TFM decomposition using non-negative matrix factorization (NMF), feature extraction, and classification. Statistical analyses show that the proposed time-frequency features significantly differentiate between PD patients and healthy speakers. Experiments are conducted with sustained vowel phonations and isolated words from the PC-GITA corpus. The proposed method achieves average classification accuracies of up to 92% in vowels and 97% in words, an improvement in accuracy ranging from 10% to 40% compared to existing methods. Further, the developed models are evaluated on an independent dataset. Results on this separate test set show accuracies ranging from 63% to 75% in vowels and from 53% to 75% in isolated words. Regarding the dysarthria level evaluation, Spearman's correlations between original and predicted labels are around 0.81 in sustained vowels and in isolated words. The results indicate that the proposed approach is suitable and robust for the automatic detection of PD. |
---|---|
ISSN: | 0885-2308 1095-8363 |
DOI: | 10.1016/j.csl.2021.101216 |
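
The first three stages named in the summary (TFM representation, NMF decomposition, feature extraction) can be illustrated with a minimal sketch. Everything specific here is an assumption, since the record does not give it: the frame length and hop size, the rank of 4, the Lee-Seung multiplicative updates used for NMF, and the two features computed in the hypothetical helper `extract_features`. None of these should be read as the authors' settings or feature set. The input is a synthetic tone with an abrupt pitch change, not voice data.

```python
import numpy as np

def time_frequency_matrix(signal, frame_len=256, hop=128):
    """Magnitude spectrogram (frequency bins x frames) from a windowed FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def nmf(V, rank=4, n_iter=200, eps=1e-9, seed=0):
    """Factor V ~ W @ H with non-negative W, H (multiplicative updates,
    Frobenius-norm objective). Assumed algorithm, chosen for brevity."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def extract_features(W, H):
    """Hypothetical features: the spectral centroid of each basis vector
    (column of W) and the mean absolute frame-to-frame change of each
    activation (row of H), which grows when activations change abruptly."""
    bins = np.arange(W.shape[0])
    centroid = (bins @ W) / (W.sum(axis=0) + 1e-9)
    roughness = np.abs(np.diff(H, axis=1)).mean(axis=1)
    return np.concatenate([centroid, roughness])

# Synthetic one-second signal at 8 kHz: 150 Hz tone that jumps to 300 Hz.
t = np.arange(8000) / 8000.0
signal = np.sin(2 * np.pi * 150 * t)
signal[4000:] = np.sin(2 * np.pi * 300 * t[4000:])

V = time_frequency_matrix(signal)   # stage 1: TFM
W, H = nmf(V)                       # stage 2: NMF decomposition
features = extract_features(W, H)   # stage 3: feature vector
```

The resulting `features` vector would then be passed to a classifier for the fourth stage; the classifier is left out because the record does not name one.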