Loading…
Stroke mortality prediction based on ensemble learning and the combination of structured and textual data
For severe cerebrovascular diseases such as stroke, the prediction of short-term mortality of patients has tremendous medical significance. In this study, we combined machine learning models Random Forest classifier (RF), Adaptive Boosting (AdaBoost), Extremely Randomised Trees (ExtraTree) classifie...
Saved in:
Published in: | Computers in biology and medicine 2023-03, Vol.155, p.106176-106176, Article 106176 |
---|---|
Main Authors: | , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | For severe cerebrovascular diseases such as stroke, the prediction of short-term mortality of patients has tremendous medical significance. In this study, we combined machine learning models Random Forest classifier (RF), Adaptive Boosting (AdaBoost), Extremely Randomised Trees (ExtraTree) classifier, XGBoost classifier, TabNet, and DistilBERT to construct a multi-level prediction model that used bioassay data and radiology text reports from haemorrhagic and ischaemic stroke patients to predict six-month mortality. The performances of the prediction models were measured using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), precision, recall, and F1-score. The prediction models were built with the use of data from 19,616 haemorrhagic stroke patients and 50,178 ischaemic stroke patients. Novel six-month mortality prediction models for these patients were developed, which enhanced the performance of the prediction models by combining laboratory test data, structured data, and textual radiology report data. The achieved performances were as follows: AUROC = 0.89, AUPRC = 0.70, precision = 0.52, recall = 0.78, and F1 score = 0.63 for haemorrhagic patients, and AUROC = 0.88, AUPRC = 0.54, precision = 0.34, recall = 0.80, and F1 score = 0.48 for ischaemic patients. Such models could be used for mortality risk assessment and early identification of high-risk stroke patients. This could contribute to more efficient utilisation of healthcare resources for stroke survivors.
•6-month mortality prediction of hemorrhagic and ischemic stroke patients by using machine learning and deep learning models.•An ensemble prediction model based on the combination of structured and textual data from hospitals.•Uncover key prediction features associated with mortality in stroke patients. |
---|---|
ISSN: | 0010-4825 1879-0534 |
DOI: | 10.1016/j.compbiomed.2022.106176 |