Chi2-MI: A hybrid feature selection based machine learning approach in diagnosis of chronic kidney disease

Bibliographic Details
Published in: Intelligent Systems with Applications, 2022-11, Vol. 16, Article 200144
Main Authors: Dey, Samrat Kumar, Uddin, Khandaker Mohammad Mohi, Babu, Hafiz Md. Hasan, Rahman, Md. Mahbubur, Howlader, Arpita, Uddin, K.M. Aslam
Format: Article
Language: English
Description
Summary:
• Development of an intelligent diagnosis system to detect chronic kidney disease.
• A hybrid wrapper feature selection method (Chi2-MI) is proposed and applied.
• Data pre-processing methods are adopted to prepare the dataset for the model.
• The most impactful features, based on correlation scores, are selected to predict CKD.
• The Extra Trees classifier diagnoses CKD with 98% accuracy, the best of 14 learning models.

Early detection and characterization are considered crucial in treating and controlling chronic renal disease. Because of the rising number of patients, the high risk of progression to end-stage renal disease, and the poor prognosis for morbidity and mortality, chronic kidney disease (CKD) is a significant burden on the healthcare system. Detecting CKD in its early stages is critical for saving millions of lives. The novelty of this study lies in developing a diagnosis system that detects chronic kidney disease using different Machine Learning (ML) algorithms supported by a hybrid feature selection approach. The study uses 400 clinical records from the chronic kidney disease dataset available in the University of California Irvine (UCI) Machine Learning Repository. Several data preparation techniques are applied to ready the dataset for the prediction model: encoding categorical features, imputing missing values, removing outliers, handling class imbalance, scaling features to a common range, and selecting relevant features. A hybrid feature selection approach based on the Chi-squared test (Chi2) and Mutual Information (MI) is proposed to remove redundant features, and a Pearson correlation matrix is also computed to identify the most important features for prediction. Of the 14 machine learning models evaluated, the Extra Trees classifier performed best, diagnosing CKD with 98% accuracy and a 2% true negative rate, without data leakage. The Bagging classifier performed worst, with only 60% accuracy.
ISSN: 2667-3053
DOI: 10.1016/j.iswa.2022.200144
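
Illustrative sketch: the summary above describes a hybrid Chi2 and MI feature selection step but does not say how the two scores are combined. The Python below is a minimal sketch of one plausible reading, not the authors' implementation. It assumes categorical (already encoded) features and keeps only the features ranked in the top k by both scores. The function names, the intersection rule, the value of k, and the toy data are all assumptions made for illustration; the toy columns are not taken from the UCI dataset.

```python
import math
from collections import Counter


def chi2_score(feature, labels):
    """Pearson chi-squared statistic of a categorical feature against class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    stat = 0.0
    for f, fc in feature_counts.items():
        for y, yc in label_counts.items():
            expected = fc * yc / n  # count expected if feature and label were independent
            stat += (observed.get((f, y), 0) - expected) ** 2 / expected
    return stat


def mutual_information(feature, labels):
    """Mutual information (in nats) between a categorical feature and class labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, y), c in joint.items():
        mi += (c / n) * math.log(c * n / (feature_counts[f] * label_counts[y]))
    return mi


def top_k(scores, k):
    """Names of the k highest-scoring features."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def chi2_mi_select(columns, labels, k):
    """Assumed hybrid rule: keep features that are in the top k under BOTH scores."""
    chi2_scores = {name: chi2_score(col, labels) for name, col in columns.items()}
    mi_scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return top_k(chi2_scores, k) & top_k(mi_scores, k)


# Toy, made-up data: 1 = CKD, 0 = not CKD. Column names are hypothetical.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "albumin_high": [1, 1, 1, 1, 0, 0, 0, 0],   # perfectly aligned with the label
    "hypertension": [1, 1, 1, 0, 0, 0, 0, 0],   # strongly associated
    "appetite_poor": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

selected = chi2_mi_select(columns, labels, k=2)
print(sorted(selected))
```

On this toy data the uninformative column is dropped because it ranks last under both scores. With continuous features, a library implementation such as scikit-learn's `chi2` and `mutual_info_classif` would be the more usual choice, and the selection should be fitted on the training split only to avoid the data leakage the summary mentions.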