Feature selection using logistic regression in case–control DNA methylation data of Parkinson's disease: A comparative study
Published in: | Journal of theoretical biology 2018-11, Vol.457, p.14-18 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • Feature selection from DNA methylation data for Parkinson's disease. • Feature reduction using logistic regression and random forest. • Prediction of disease condition using a classifier based on the identified features. • Features uniquely identified by logistic regression were involved in PD.
Parkinson's disease (PD) is a progressive neurological disorder caused by the degeneration of dopaminergic neurons in the substantia nigra pars compacta. The pathogenesis of the disease is not fully understood, but it has been linked to complex genetic, epigenetic and environmental interactions. A substantial number of studies have shown that epigenetic modifications contribute to the progression of PD. In the present study, we analyzed methylation patterns of 1726 transcripts captured from 66 samples of 450k data, comprising 43 controls and 23 diseased samples. We used logistic regression (LR) for feature reduction and built a classifier with higher accuracy than one using all features together. The performance of the classifier was compared with that of other feature reduction approaches, namely random forest (RF) and principal component analysis (PCA). Feature reduction with LR and RF performed better than PCA. Some of the features, corresponding to genes such as COMT, DCTN1 and PRNP, were uniquely identified by LR and are reported to play a significant role in PD. |
---|---|
ISSN: | 0022-5193 1095-8541 |
DOI: | 10.1016/j.jtbi.2018.08.018 |
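
The summary does not say how LR was applied for feature reduction, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes an L2-penalized logistic regression fitted by plain gradient descent on standardized features, followed by ranking features by the absolute value of their coefficients. The data are simulated (43 controls and 23 cases, matching the sample counts above, but only 20 toy features), and every function name, the penalty strength, the learning rate and the choice of informative features are hypothetical.

```python
import math
import random


def make_toy_data(n_controls=43, n_cases=23, n_features=20,
                  informative=(2, 7, 11), shift=0.3, seed=0):
    """Simulate methylation-like values; cases are shifted on a few features.

    This is synthetic data for illustration, not the data set from the study.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label, count in ((0, n_controls), (1, n_cases)):
        for _ in range(count):
            row = [rng.gauss(0.5, 0.05) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y


def standardize(X):
    """Scale every column to zero mean and unit variance."""
    n, p = len(X), len(X[0])
    means = [sum(r[j] for r in X) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in X) / n)
            for j in range(p)]
    return [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in X]


def fit_logistic(X, y, lam=0.1, lr=0.5, n_iter=300):
    """Fit L2-penalized logistic regression by batch gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, target in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - target  # predicted minus observed
            grad_b += err
            for j in range(p):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n
        # The intercept is not penalized; the weights are shrunk by lam.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
    return w, b


def select_top_features(w, k):
    """Return the indices of the k largest coefficients by absolute value."""
    ranked = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    return sorted(ranked[:k])


X, y = make_toy_data()
w, b = fit_logistic(standardize(X), y)
selected = select_top_features(w, k=3)
print("selected feature indices:", selected)
```

In a comparison like the one the summary describes, the selected columns would then be passed to a classifier and its accuracy compared with that of a classifier trained on all features, or on features chosen by RF importance or PCA components. Those steps are omitted here.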