Loading…
Developing a machine learning model to detect diagnostic uncertainty in clinical documentation
Diagnostic uncertainty, when unrecognized or poorly communicated, can result in diagnostic error. However, diagnostic uncertainty is challenging to study due to a lack of validated identification methods. This study aims to identify distinct linguistic patterns associated with diagnostic uncertainty...
Saved in:
Published in: | Journal of hospital medicine 2023-05, Vol.18 (5), p.405-412 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Diagnostic uncertainty, when unrecognized or poorly communicated, can result in diagnostic error. However, diagnostic uncertainty is challenging to study due to a lack of validated identification methods. This study aims to identify distinct linguistic patterns associated with diagnostic uncertainty in clinical documentation.
This case-control study compares the clinical documentation of hospitalized children who received a novel uncertain diagnosis (UD) diagnosis label during their admission to a set of matched controls. Linguistic analyses identified potential linguistic indicators (i.e., words or phrases) of diagnostic uncertainty that were then manually reviewed by a linguist and clinical experts to identify those most relevant to diagnostic uncertainty. A natural language processing program categorized medical terminology into semantic types (i.e., sign or symptom), from which we identified a subset of these semantic types that both categorized reliably and were relevant to diagnostic uncertainty. Finally, a competitive machine learning modeling strategy utilizing the linguistic indicators and semantic types compared different predictive models for identifying diagnostic uncertainty.
Our cohort included 242 UD-labeled patients and 932 matched controls with a combination of 3070 clinical notes. The best-performing model was a random forest, utilizing a combination of linguistic indicators and semantic types, yielding a sensitivity of 89.4% and a positive predictive value of 96.7%.
Expert labeling, natural language processing, and machine learning methods combined with human validation resulted in highly predictive models to detect diagnostic uncertainty in clinical documentation and represent a promising approach to detecting, studying, and ultimately mitigating diagnostic uncertainty in clinical practice. |
---|---|
ISSN: | 1553-5592 1553-5606 |
DOI: | 10.1002/jhm.13080 |