Evaluation of an Explainable Tree-Based AI Model for Thrombophilia Diagnosis and Thrombosis Risk Stratification

Bibliographic Details
Published in: Blood 2023-11, Vol.142 (Supplement 1), p.2300-2300
Main Authors: McRae, Hannah L, Kahl, Fabian, Kapsecker, Maximilian, Rühl, Heiko, Jonas, Stephan M, Pötzsch, Bernd
Format: Article
Language: English
Description
Summary:

Background: Thrombophilia diagnosis is often a convoluted process involving the collection and analysis of clinical data, specialized laboratory testing, and high-level decision-making. The process is inherently subjective because clinical practice philosophy differs between practitioners, and it can vary with institutional guidelines and available resources. Patient care and clinical outcomes may be affected as a result, which creates an opportunity to optimize thrombophilia diagnosis using AI.

Methods: This retrospective study evaluated the utility and effectiveness of an AI-powered algorithm (XGBoost) programmed to replicate the process of thrombophilia diagnosis. A total of 256 patients were referred by their clinician for thrombophilia evaluation at our ambulatory coagulation clinic between November 2019 and February 2023, and clinical and laboratory data were collected from the electronic medical record. Thrombophilia diagnosis was established (or ruled out) on the basis of the patients' personal and family history of thrombosis and the results of thrombophilia testing, including established acquired and inherited thrombophilia risk factors. XGBoost, a gradient boosting algorithm for supervised learning, was used together with a randomized search over a predefined set of tree parameters to find a well-performing configuration on the data using cross-validation. The dataset contained 12 clinical data parameters and 26 laboratory data parameters. The target variable was two-dimensional, consisting of the thrombophilia probability score and thrombophilia risk factors. Thrombophilia probability scores were calculated from clinical data, with one point assigned for each of the following criteria: a) spontaneous thrombotic event; b) mild risk situation; c) recurrent thrombosis; d) atypical thrombosis localization; e) age
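The summary breaks off partway through the scoring criteria, so the full rule cannot be reproduced from this record. As a minimal sketch of the one-point-per-criterion idea, the Python below counts only criteria a) through d), which are the ones listed in full. The class name, field names, and function name are hypothetical and chosen for illustration; they are not taken from the study, and the result is a partial score only.

```python
from dataclasses import dataclass


@dataclass
class ClinicalFindings:
    """Hypothetical container for criteria a) to d); names are illustrative."""
    spontaneous_event: bool = False        # a) spontaneous thrombotic event
    mild_risk_situation: bool = False      # b) mild risk situation
    recurrent_thrombosis: bool = False     # c) recurrent thrombosis
    atypical_localization: bool = False    # d) atypical thrombosis localization


def partial_probability_score(findings: ClinicalFindings) -> int:
    """Return one point per criterion present.

    Assumption: only criteria a) to d) are counted, because the remaining
    criteria are cut off in the source summary.
    """
    return sum([
        findings.spontaneous_event,
        findings.mild_risk_situation,
        findings.recurrent_thrombosis,
        findings.atypical_localization,
    ])


# Example: a patient with a spontaneous, recurrent thrombosis.
patient = ClinicalFindings(spontaneous_event=True, recurrent_thrombosis=True)
print(partial_probability_score(patient))
```

Keeping each criterion as a separate boolean makes it straightforward to extend the sum once the remaining criteria are known from the full abstract.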
ISSN: 0006-4971, 1528-0020
DOI: 10.1182/blood-2023-190920