Machine learning classification analysis for a hypertensive population as a function of several risk factors

Bibliographic Details
Published in: Expert systems with applications 2018-11, Vol.110, p.206-215
Main Authors: López-Martínez, Fernando; Schwarcz.MD, Aron; Núñez-Valdez, Edward Rolando; García-Díaz, Vicente
Format: Article
Language: English
Description
Summary:
• Best AUC 0.73 (95% CI [0.70–0.76]), showing a fair result with the final diagnosis.
• This model correctly predicts hypertensive individuals 73% better than a randomly selected individual.
• According to the model, kidney disease and smoking habits do not affect the odds of the outcome.
• According to the model, the odds of having hypertension are higher for female individuals than for male individuals.
• Non-Hispanic black individuals have higher odds of having hypertension than the other ethnic groups.

This research presents a prediction model, built with logistic regression, to evaluate the association between hypertension and gender, race, BMI, age, smoking, kidney disease and diabetes. Data were collected from NHANES datasets from 2007 to 2016. The sample is an unbalanced dataset of 19,709 individuals: 83% non-hypertensive and 17% hypertensive. Some risk factors were categorized, and indicator variables were created to transform the continuous variables into binary form so that the predictors are consistent with the outcome. The results show a sensitivity of 77%, a specificity of 68%, a precision (positive predictive value) of 32% in the test sample, and a calculated AUC of 0.73 (95% CI [0.70–0.76]). The model also confirms that individuals who are obese, aged between 71 and 80 years, non-Hispanic black, and male have higher odds of having hypertension. Diabetes, kidney disease and smoking habits do not affect the odds of the outcome. In clinical practice, this model can be used to inform patients and guide population health management in detecting patients with a high probability of developing cardiovascular disease. The proposed logistic regression method can be used as an expert system’s inference engine to support experts in the cardiovascular disease field in analyzing patients at risk of developing hypertension.
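The reported precision can be roughly cross-checked from the other figures in the summary. The sketch below is not taken from the article: it applies the standard Bayes relationship between sensitivity, specificity and prevalence, and it assumes that the 17% hypertensive share of the full dataset also holds in the test sample, which the summary does not state. The function name `positive_predictive_value` is illustrative only.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Positive predictive value (precision) from sensitivity, specificity and prevalence."""
    # Share of the population that is hypertensive and flagged by the model.
    true_positives = sensitivity * prevalence
    # Share of the population that is non-hypertensive but still flagged.
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Figures from the summary; the 17% prevalence in the test sample is an assumption.
ppv = positive_predictive_value(sensitivity=0.77, specificity=0.68, prevalence=0.17)
print(f"PPV = {ppv:.2f}")  # prints "PPV = 0.33"
```

Under that assumption the result is about 33%, close to the 32% reported for the test sample. The low precision despite a 77% sensitivity follows from the class imbalance: non-hypertensive individuals are far more numerous, so even a moderate false-positive rate produces more false alarms than true detections.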
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2018.06.006