Prediction of intra-abdominal injury using natural language processing of electronic medical record data

Bibliographic Details
Published in:	Surgery 2024-09, Vol.176 (3), p.577-585
Main Authors:	Danna, Giovanna; Garg, Ravi; Buchheit, Joanna; Patel, Radha; Zhan, Tiannan; Ellyn, Alexander; Maqbool, Farhan; Yala, Linda; Moklyak, Yuriy; Frydman, James; Kho, Abel; Kong, Nan; Furmanchuk, Alona; Lundberg, Alexander; Stey, Anne M.
Format:	Article
Language:	English
Subjects:	Abdominal Injuries - diagnosis; Abdominal Injuries - epidemiology; Adult; Aged; Electronic Health Records - statistics & numerical data; Female; Humans; Male; Middle Aged; Natural Language Processing; Retrospective Studies; Young Adult

Description
Summary:	This study aimed to use natural language processing to predict the presence of intra-abdominal injury using unstructured data from electronic medical records. This was a random-sample retrospective observational cohort study leveraging unstructured data from injured patients taken to one of 9 acute care hospitals in an integrated health system between 2015 and 2021. Patients with International Classification of Diseases External Cause of Morbidity codes were identified. History and physical, consult, progress, and radiology report text from the first 8 hours of care were abstracted. Annotator dyads independently annotated encounters’ text files to establish ground truth regarding whether intra-abdominal injury occurred. Features were extracted from text using natural language processing techniques, bag of words, and principal component analysis. We tested logistic regression, random forests, and gradient boosting machine to determine accuracy, recall, and precision of natural language processing to predict intra-abdominal injury. A random sample of 7,000 patient encounters of 177,127 was annotated. Only 2,951 had sufficient information to determine whether an intra-abdominal injury was present. Among those, 84 (2.9%) had an intra-abdominal injury. The concordance between annotators was 0.989. Logistic regression of features identified with bag of words and principal component analysis had the best predictive ability, with an area under the receiver operating characteristic curve of 0.9, recall of 0.73, and precision of 0.17. Text features with greatest importance included “abdomen,” “pelvis,” “spleen,” and “hematoma.” Natural language processing could be a screening decision support tool, which, if paired with human clinical assessment, can maximize precision of intra-abdominal injury identification.
ISSN:	0039-6060, 1532-7361
DOI:	10.1016/j.surg.2024.05.042
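
To make the reported recall and precision concrete, the following is a minimal arithmetic sketch that backs out approximate confusion-matrix counts from the figures in the summary. It assumes, purely for illustration, that recall (0.73) and precision (0.17) were measured on all 2,951 annotated encounters with 84 injuries; the summary does not state the evaluation set, so the resulting counts are illustrative and are not results reported by the authors. The function name `implied_counts` is hypothetical.

```python
def implied_counts(n_total, n_positive, recall, precision):
    """Back out approximate confusion-matrix counts from recall and precision.

    Assumption (not stated in the summary): both metrics were computed on the
    same set of n_total encounters containing n_positive true injuries.
    """
    tp = round(recall * n_positive)   # injuries the model flags correctly
    fn = n_positive - tp              # injuries the model misses
    flagged = round(tp / precision)   # all encounters the model flags
    fp = flagged - tp                 # flagged encounters without injury
    tn = n_total - n_positive - fp    # unflagged encounters without injury
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn, "flagged": flagged}


# Figures taken from the summary; the evaluation set is an assumption.
counts = implied_counts(n_total=2951, n_positive=84, recall=0.73, precision=0.17)
print(counts)
```

Under that assumption, roughly six encounters would be flagged for every true injury found, which is consistent with the summary's framing of the model as a screening aid to be paired with human clinical assessment.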