Loading…

Machine Learning Models for ASCVD Risk Prediction in an Asian Population - How to Validate the Model is Important

Introduction: Atherosclerotic cardiovascular disease (ASCVD) is prevalent worldwide including Taiwan, however widely accepted tools to assess the risk of ASCVD are lacking in Taiwan. Machine learning models are potentially useful for risk evaluation. In this study we used two cohorts to test the fea...

Full description

Saved in:
Bibliographic Details
Published in:Acta Cardiologica Sinica 2023-11, Vol.39 (6), p.901-912
Main Authors: Hsiao, Yu-Chung, Kuo, Chen-Yuan, Lin, Fang-Ju, Wu, Yen-Wen, Lin, Tsung-Hsien, Yeh, Hung-I, Chen, Jaw-Wen, Wu, Chau-Chung
Format: Article
Language:English
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Introduction: Atherosclerotic cardiovascular disease (ASCVD) is prevalent worldwide including Taiwan, however widely accepted tools to assess the risk of ASCVD are lacking in Taiwan. Machine learning models are potentially useful for risk evaluation. In this study we used two cohorts to test the feasibility of machine learning with transfer learning for developing an ASCVD risk prediction model in Taiwan. Methods: Two multi-center observational registry cohorts, T-SPARCLE and T-PPARCLE were used in this study. The variables selected were based on European, U.S. and Asian guidelines. Both registries recorded the ASCVD outcomes of the patients. Ten-fold validation and temporal validation methods were used to evaluate the performance of the binary classification analysis [prediction of major adverse cardiovascular (CV) events in one year]. Time-to-event analyses were also performed. Results: In the binary classification analysis, eXtreme Gradient Boosting (XGBoost) and random forest had the best performance, with areas under the receiver operating characteristic curve (AUC-ROC) of 0.72 (0.68-0.76) and 0.73 (0.69-0.77), respectively, although it was not significantly better than other models. Temporal validation was also performed, and the data showed significant differences in the distribution of various features and event rate. The AUC-ROC of XGBoost dropped to 0.66 (0.59-0.73), while that of random forest dropped to 0.69 (0.62-0.76) in the temporal validation method, and the performance also became numerically worse than that of the logistic regression model. In the time-to-event analysis, most models had a concordance index of around 0.70. Conclusions: Machine learning models with appropriate transfer learning may be a useful tool for the development of CV risk prediction models and may help improve patient care in the future.
ISSN:1011-6842
DOI:10.6515/ACS.202311_39(6).20230528A