Machine learning-based marker for coronary artery disease: derivation and validation in two longitudinal cohorts
Published in: | The Lancet (British edition) 2023-01, Vol.401 (10372), p.215-225 |
---|---|
Format: | Article |
Language: | English |
Summary: | Binary diagnosis of coronary artery disease does not preserve the complexity of the disease or quantify its severity or the associated risk of death; hence, a quantitative marker of coronary artery disease is warranted. We evaluated a quantitative marker of coronary artery disease derived from the probabilities of a machine learning model.
In this cohort study, we developed and validated a coronary artery disease-predictive machine learning model using 95 935 electronic health records and assessed its probabilities as in-silico scores for coronary artery disease (ISCAD; range 0 [lowest probability] to 1 [highest probability]) in participants in two longitudinal biobank cohorts. We measured the association of ISCAD with clinical outcomes—namely, coronary artery stenosis, obstructive coronary artery disease, multivessel coronary artery disease, all-cause death, and coronary artery disease sequelae.
Among 95 935 participants, 35 749 were from the BioMe Biobank (median age 61 years [IQR 18]; 14 599 [41%] were male and 21 150 [59%] were female; 5130 [14%] had diagnosed coronary artery disease) and 60 186 were from the UK Biobank (median age 62 years [IQR 15]; 25 031 [42%] were male and 35 155 [58%] were female; 8128 [14%] had diagnosed coronary artery disease). The model predicted coronary artery disease with an area under the receiver operating characteristic curve of 0·95 (95% CI 0·94–0·95; sensitivity of 0·94 [0·94–0·95] and specificity of 0·82 [0·81–0·83]) and 0·93 (0·92–0·93; sensitivity of 0·90 [0·89–0·90] and specificity of 0·88 [0·87–0·88]) in the BioMe validation and holdout sets, respectively, and 0·91 (0·91–0·91; sensitivity of 0·84 [0·83–0·84] and specificity of 0·83 [0·82–0·83]) in the UK Biobank external test set. ISCAD captured coronary artery disease risk from known risk factors, pooled cohort equations, and polygenic risk scores. Coronary artery stenosis increased quantitatively with ascending ISCAD quartiles (increase per quartile of 12 percentage points), including risk of obstructive coronary artery disease, multivessel coronary artery disease, and stenosis of major coronary arteries. Hazard ratios (HRs) and prevalence of all-cause death increased stepwise over ISCAD deciles (decile 1: HR 1·0 [95% CI 1·0–1·0], 0·2% prevalence; decile 6: 11 [3·9–31], 3·1% prevalence; and decile 10: 56 [20–158], 11% prevalence). A similar trend was observed for recurrent myocardial infarction. 12 (46%) undiagnosed individuals with high ISCAD (≥0·9) had clinical evidence of |
---|---|
ISSN: | 0140-6736 1474-547X |
DOI: | 10.1016/S0140-6736(22)02079-7 |