Negation recognition in clinical natural language processing using a combination of the NegEx algorithm and a convolutional neural network

Bibliographic Details
Published in:	BMC medical informatics and decision making 2023-10, Vol.23 (1), p.1-216, Article 216
Main Authors:	Argüello-González, Guillermo; Aquino-Esperanza, José; Salvador, Daniel; Bretón-Romero, Rosa; Del Río-Bermudez, Carlos; Tello, Jorge; Menke, Sebastian
Format:	Article
Language:	English
Subjects:	Algorithms; Analysis; Artificial neural networks; Classifiers; Clinical Natural Language Processing; CNN; Computational linguistics; Customization; Data processing; Electronic health records; Electronic medical records; Electronic records; Health aspects; Language; Language processing; Linguistics; Machine learning; Methods; Natural language interfaces; Natural language processing; Negation; NegEx; Neural networks; Non-English languages; Performance measurement; Recall; Recognition; Unstructured data

Description
Summary:	Background: Important clinical information about patients is stored in unstructured free-text fields of Electronic Health Records (EHRs). While this information can be extracted using clinical Natural Language Processing (cNLP), the recognition of negation modifiers remains an important challenge. A wide range of cNLP applications have been developed to detect the negation of medical entities in clinical free text; however, effective solutions for languages other than English are scarce. This study aimed to develop a solution for negation recognition in Spanish EHRs based on a combination of a customized rule-based NegEx layer and a convolutional neural network (CNN).
Methods: Based on our previous experience in real world evidence (RWE) studies using information embedded in EHRs, negation recognition was simplified into a binary problem ('affirmative' vs. 'non-affirmative' class). For the NegEx layer, negation rules were obtained from a publicly available Spanish corpus and enriched with custom ones. The CNN binary classifier was trained on EHRs annotated for clinical named entities (cNEs) and negation markers by medical doctors.
Results: The proposed negation recognition pipeline obtained precision, recall, and F1-score of 0.93, 0.94, and 0.94 for the 'affirmative' class, and 0.86, 0.84, and 0.85 for the 'non-affirmative' class, respectively. To validate the generalization capabilities of our methodology, we applied the negation recognition pipeline to EHRs (6,710 cNEs) from a different data source distribution than the training corpus and obtained consistent performance metrics for the 'affirmative' and 'non-affirmative' classes (0.95, 0.97, and 0.96; and 0.90, 0.83, and 0.86 for precision, recall, and F1-score, respectively). Lastly, we evaluated the pipeline against two publicly available Spanish negation corpora, IULA and NUBes, obtaining state-of-the-art metrics (1.00, 0.99, and 0.99; and 1.00, 0.93, and 0.96 for precision, recall, and F1-score, respectively).
Conclusion: Negation recognition is a source of low precision in the retrieval of cNEs from the free text of EHRs. Combining a customized rule-based NegEx layer with a CNN binary classifier outperformed many other current approaches. RWE studies benefit greatly from the correct recognition of negation, as it reduces false-positive detections of cNEs, which would otherwise undermine the credibility of cNLP systems.
Keywords: Negation, NegEx, CNN, Electronic health records, Clinical Nat
ISSN:	1472-6947
DOI:	10.1186/s12911-023-02301-5
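
Illustrative sketch (not part of the record): the summary describes a rule-based NegEx layer combined with a CNN binary classifier that assigns each clinical named entity to the 'affirmative' or 'non-affirmative' class. The Python sketch below shows one way such a two-layer decision could be wired together. Everything in it is an assumption made for illustration: the trigger and terminator word lists, the five-token window, the function names, and the order in which the two layers are consulted are hypothetical, and the CNN is replaced by a placeholder callable. The abstract does not state how the authors combine the two layers.

```python
import re
from typing import Callable, Optional

# Hypothetical Spanish trigger and terminator lists, for illustration only.
# The study's rules came from a public Spanish corpus plus custom additions.
NEGATION_TRIGGERS = ["no", "sin", "niega", "descarta", "ausencia de"]
SCOPE_TERMINATORS = ["pero", "aunque", "excepto"]
MAX_GAP_TOKENS = 5  # assumed maximum number of tokens between trigger and entity


def negex_label(sentence: str, entity: str) -> Optional[str]:
    """Return 'non-affirmative' if a trigger precedes the entity in scope, else None."""
    text = sentence.lower()
    start = text.find(entity.lower())
    if start < 0:
        return None  # entity does not occur in this sentence
    left = text[:start]
    # A terminator closes the scope of any trigger that appears before it,
    # so only the text after the last terminator is searched.
    for term in SCOPE_TERMINATORS:
        hits = list(re.finditer(rf"\b{re.escape(term)}\b", left))
        if hits:
            left = left[hits[-1].end():]
    for trigger in NEGATION_TRIGGERS:
        for hit in re.finditer(rf"\b{re.escape(trigger)}\b", left):
            # Count the tokens separating the trigger from the entity.
            if len(left[hit.end():].split()) <= MAX_GAP_TOKENS:
                return "non-affirmative"
    return None


def always_affirmative(sentence: str, entity: str) -> str:
    """Placeholder standing in for a trained CNN binary classifier."""
    return "affirmative"


def classify(
    sentence: str,
    entity: str,
    fallback: Callable[[str, str], str] = always_affirmative,
) -> str:
    """Use the rule layer first; defer to the fallback model when no rule fires."""
    return negex_label(sentence, entity) or fallback(sentence, entity)


if __name__ == "__main__":
    sentence = "El paciente no presenta fiebre, aunque refiere tos"
    for entity in ("fiebre", "tos"):
        print(entity, "->", classify(sentence, entity))
```

In this sketch the rule layer only ever returns the 'non-affirmative' label, and every undecided entity is passed to the fallback. A real system would replace always_affirmative with a trained model and would likely need post-entity triggers and a much larger rule set.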