
Intrinsically Modified Physio-Biological Features Driven Heterogenous Ensemble Learning Model for Cardio-Vascular Disease Prediction

Bibliographic Details
Published in: Letters in High Energy Physics, 2024-02, Vol. 2024 (1)
Main Author: L. Hamsaveni et al.
Format: Article
Language: English
Description
Summary: Cardiovascular disease (CVD) and other non-communicable illnesses have been on the rise in recent years. Despite innovations in computer-aided diagnosis (CAD) and clinical decision systems, heart-disease prediction, unlike vision-based e-healthcare practices, requires learning over the different bio-physiological parameters related to the heart's health. Dataset limitations, including class imbalance, redundant computation, the risk of converging to local minima, and the resulting low accuracy, restrict the real-time usefulness of existing cardiovascular disease prediction (CDP) systems. In this paper, a robust CVD prediction model is proposed, based on heterogeneous ensemble learning driven by intrinsically modified bio-physiological parameters. We focused on both feature optimization and computational efficiency to achieve a robust CAD solution for CVD diagnosis. The proposed method uses age, gender, cholesterol, protein profiles, body mass index, stroke profile or history, electrocardiogram information, and other attributes from the benchmark dataset to enable a scalable CVD prediction model. To ensure semantic feature-driven learning, these features were processed with Word2Vec embedding, followed by resampling with the synthetic minority over-sampling technique (SMOTE) and its variants, SMOTE-Boundary Line and SMOTE-ENN, to alleviate class imbalance. Subsequently, Principal Component Analysis (PCA), Cross-Correlation Analysis (CCRA), and Significant Predictor Test (SPT) methods were each applied separately to retain the optimal feature sets. The selected feature instances were then normalized using Min-Max normalization.
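As an illustration of the Min-Max normalization step mentioned above, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function name `min_max_scale`, the sample records, and the choice to map constant columns to 0.0 are assumptions made for this example only.

```python
def min_max_scale(rows):
    """Rescale every column of `rows` to the range [0, 1].

    Each value v in a column becomes (v - min) / (max - min).
    Columns with no spread are mapped to 0.0 (an assumption of this sketch).
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Hypothetical records: (age, cholesterol, body mass index)
records = [(40, 180.0, 22.0), (60, 240.0, 30.0), (50, 210.0, 26.0)]
scaled = min_max_scale(records)
```

Scaling every feature to a common range keeps attributes with large numeric values, such as cholesterol, from dominating distance-based or margin-based classifiers.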
The normalized features were used to train a heterogeneous ensemble learning model comprising Base Classifier (RF), Decision Tree (DT), Support Vector Machine (SVM) variants, Naïve Bayes (NB), Logistic Regression (LOGR), Linear Regression (LR), Random Forest (RF), and Extra Tree Classifier (ETC) as base classifiers. The maximum voting ensemble (MVE) method was used to label each individual as CVD-Positive or CVD-Negative. The results show that the proposed method is robust enough for real-world CDS scenarios, as it surpasses all prior state-of-the-art approaches in CVD prediction accuracy (99.93%), precision (99.69%), recall (99.53%), and F-measure (99.60%).
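The maximum voting idea can be sketched as follows, assuming each base classifier returns one label per individual and the most frequent label wins. The three rule-based stand-in classifiers, their thresholds, and the tie-breaking rule are hypothetical and chosen only to keep the example self-contained; they do not reproduce the trained models described in the summary.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label among the base-classifier predictions.

    Ties go to the tied label that appears first in `predictions`
    (an assumption of this sketch).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def ensemble_predict(classifiers, sample):
    """Collect one prediction per base classifier and apply majority voting."""
    return majority_vote([classify(sample) for classify in classifiers])


def label(is_positive):
    return "CVD-Positive" if is_positive else "CVD-Negative"


# Hypothetical stand-ins for trained base classifiers (thresholds are illustrative).
classifiers = [
    lambda s: label(s["cholesterol"] > 240),
    lambda s: label(s["age"] > 55),
    lambda s: label(s["bmi"] > 30),
]

patient = {"age": 63, "cholesterol": 260, "bmi": 24}
decision = ensemble_predict(classifiers, patient)
```

In this example two of the three stand-in classifiers vote CVD-Positive, so the ensemble returns that label even though one classifier disagrees.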
ISSN:2632-2714