
The development of a novel natural language processing tool to identify pediatric chest radiograph reports with pneumonia

Chest radiographs are frequently used to diagnose community-acquired pneumonia (CAP) for children in the acute care setting. Natural language processing (NLP)-based tools may be incorporated into the electronic health record and combined with other clinical data to develop meaningful clinical decisi...

Bibliographic Details
Published in: Frontiers in digital health 2023-02, Vol.5, p.1104604-1104604
Main Authors: Rixe, Nancy, Frisch, Adam, Wang, Zhendong, Martin, Judith M, Suresh, Srinivasan, Florin, Todd A, Ramgopal, Sriram
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c468t-58952ae55991189a3420232b8d94d3a227c9facb835c645ec07b505785c620c3
cites cdi_FETCH-LOGICAL-c468t-58952ae55991189a3420232b8d94d3a227c9facb835c645ec07b505785c620c3
container_end_page 1104604
container_issue
container_start_page 1104604
container_title Frontiers in digital health
container_volume 5
creator Rixe, Nancy
Frisch, Adam
Wang, Zhendong
Martin, Judith M
Suresh, Srinivasan
Florin, Todd A
Ramgopal, Sriram
description Chest radiographs are frequently used to diagnose community-acquired pneumonia (CAP) for children in the acute care setting. Natural language processing (NLP)-based tools may be incorporated into the electronic health record and combined with other clinical data to develop meaningful clinical decision support tools for this common pediatric infection. We sought to develop and internally validate NLP algorithms to identify pediatric chest radiograph (CXR) reports with pneumonia. We performed a retrospective study of encounters for patients from six pediatric hospitals over a 3-year period. We utilized six NLP techniques: word embedding, support vector machines, extreme gradient boosting (XGBoost), light gradient boosting machines, Naïve Bayes, and logistic regression. We evaluated the performance of each model on a validation sample of 1,350 chest radiographs, developed as a stratified random sample of 35% admitted and 65% discharged patients, against both expert consensus and diagnosis codes. Of 172,662 encounters in the derivation sample, 15.6% had a discharge diagnosis of pneumonia in a primary or secondary position. The median patient age in the derivation sample was 3.7 years (interquartile range, 1.4-9.5 years). In the validation sample, 185/1350 (13.8%) and 205/1350 (15.3%) were classified as pneumonia by content experts and by diagnosis codes, respectively. Compared to content experts, Naïve Bayes had the highest sensitivity (93.5%) and XGBoost had the highest F1 score (72.4). Compared to a diagnosis code of pneumonia, the highest sensitivity was again with the Naïve Bayes (80.1%), and the highest F1 score was with the support vector machine (53.0%). NLP algorithms can accurately identify pediatric pneumonia from radiography reports. Following external validation and implementation into the electronic health record, these algorithms can facilitate clinical decision support and inform large database research.
doi_str_mv 10.3389/fdgth.2023.1104604
format article
fullrecord <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_58671e925d954a5893df9035bbf532c9</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_58671e925d954a5893df9035bbf532c9</doaj_id><sourcerecordid>2786513432</sourcerecordid><originalsourceid>FETCH-LOGICAL-c468t-58952ae55991189a3420232b8d94d3a227c9facb835c645ec07b505785c620c3</originalsourceid><addsrcrecordid>eNpVUk1v3CAQtaJWSZTmD_QQcexlt3wY21wqVVE_IkXqZQ-9oTGMbSLbOIBT7b8Pm91GyQWYYd4b3mOK4jOjWyEa9bWzfRq2nHKxZYyWFS3Pikte1WLDpfj74c35oriO8YFSyiXjnMrz4kJUilFZ08tivxuQWHzC0S8Tzon4jgCZfU6QGdIaYCQjzP0KPZIleIMxurknyfsxL8TZDHLdnixoHaTgDDEDxkQCWOf7AMtAAi4-pEj-uTSQZcZ18rODT8XHDsaI16f9qtj9_LG7_b25__Pr7vb7_caUVZM2slGSA0qpFGONAlEeNPO2saq0AjivjerAtI2QpiolGlq3MmtrcsipEVfF3ZHWenjQS3AThL324PRLwodeQ0jOjKhlU9UMFZdWyRJyY2E7RYVs204KblTm-nbkWtZ2Qmuy9OzPO9L3N7MbdO-ftFIqO08zwZcTQfCPa7ZJTy4aHLPD6Neoed1UkolS8FzKj6Um-BgDdq9tGNWHCdAvE6APbujTBGTQzdsHvkL-_7d4Bnl1r3s</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2786513432</pqid></control><display><type>article</type><title>The development of a novel natural language processing tool to identify pediatric chest radiograph reports with pneumonia</title><source>PubMed (Medline)</source><creator>Rixe, Nancy ; Frisch, Adam ; Wang, Zhendong ; Martin, Judith M ; Suresh, Srinivasan ; Florin, Todd A ; Ramgopal, Sriram</creator><creatorcontrib>Rixe, Nancy ; Frisch, Adam ; Wang, Zhendong ; Martin, Judith M ; Suresh, Srinivasan ; Florin, Todd A ; Ramgopal, Sriram</creatorcontrib><description>Chest radiographs are frequently used to diagnose community-acquired pneumonia (CAP) for children in the acute care setting. Natural language processing (NLP)-based tools may be incorporated into the electronic health record and combined with other clinical data to develop meaningful clinical decision support tools for this common pediatric infection. 
We sought to develop and internally validate NLP algorithms to identify pediatric chest radiograph (CXR) reports with pneumonia. We performed a retrospective study of encounters for patients from six pediatric hospitals over a 3-year period. We utilized six NLP techniques: word embedding, support vector machines, extreme gradient boosting (XGBoost), light gradient boosting machines, Naïve Bayes, and logistic regression. We evaluated the performance of each model on a validation sample of 1,350 chest radiographs, developed as a stratified random sample of 35% admitted and 65% discharged patients, against both expert consensus and diagnosis codes. Of 172,662 encounters in the derivation sample, 15.6% had a discharge diagnosis of pneumonia in a primary or secondary position. The median patient age in the derivation sample was 3.7 years (interquartile range, 1.4-9.5 years). In the validation sample, 185/1350 (13.8%) and 205/1350 (15.3%) were classified as pneumonia by content experts and by diagnosis codes, respectively. Compared to content experts, Naïve Bayes had the highest sensitivity (93.5%) and XGBoost had the highest F1 score (72.4). Compared to a diagnosis code of pneumonia, the highest sensitivity was again with the Naïve Bayes (80.1%), and the highest F1 score was with the support vector machine (53.0%). NLP algorithms can accurately identify pediatric pneumonia from radiography reports. 
Following external validation and implementation into the electronic health record, these algorithms can facilitate clinical decision support and inform large database research.</description><identifier>ISSN: 2673-253X</identifier><identifier>EISSN: 2673-253X</identifier><identifier>DOI: 10.3389/fdgth.2023.1104604</identifier><identifier>PMID: 36910570</identifier><language>eng</language><publisher>Switzerland: Frontiers Media S.A</publisher><subject>chest radiograph ; clinical decision support ; Digital Health ; machine learning ; natural language processing ; pediatric ; pneumonia</subject><ispartof>Frontiers in digital health, 2023-02, Vol.5, p.1104604-1104604</ispartof><rights>2023 Rixe, Frisch, Wang, Martin, Suresh, Florin and Ramgopal.</rights><rights>2023 Rixe, Frisch, Wang, Martin, Suresh, Florin and Ramgopal. 2023 Rixe, Frisch, Wang, Martin, Suresh, Florin and Ramgopal</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c468t-58952ae55991189a3420232b8d94d3a227c9facb835c645ec07b505785c620c3</citedby><cites>FETCH-LOGICAL-c468t-58952ae55991189a3420232b8d94d3a227c9facb835c645ec07b505785c620c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9992200/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9992200/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,724,777,781,882,27905,27906,53772,53774</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36910570$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Rixe, Nancy</creatorcontrib><creatorcontrib>Frisch, Adam</creatorcontrib><creatorcontrib>Wang, Zhendong</creatorcontrib><creatorcontrib>Martin, 
Judith M</creatorcontrib><creatorcontrib>Suresh, Srinivasan</creatorcontrib><creatorcontrib>Florin, Todd A</creatorcontrib><creatorcontrib>Ramgopal, Sriram</creatorcontrib><title>The development of a novel natural language processing tool to identify pediatric chest radiograph reports with pneumonia</title><title>Frontiers in digital health</title><addtitle>Front Digit Health</addtitle><description>Chest radiographs are frequently used to diagnose community-acquired pneumonia (CAP) for children in the acute care setting. Natural language processing (NLP)-based tools may be incorporated into the electronic health record and combined with other clinical data to develop meaningful clinical decision support tools for this common pediatric infection. We sought to develop and internally validate NLP algorithms to identify pediatric chest radiograph (CXR) reports with pneumonia. We performed a retrospective study of encounters for patients from six pediatric hospitals over a 3-year period. We utilized six NLP techniques: word embedding, support vector machines, extreme gradient boosting (XGBoost), light gradient boosting machines, Naïve Bayes, and logistic regression. We evaluated the performance of each model on a validation sample of 1,350 chest radiographs, developed as a stratified random sample of 35% admitted and 65% discharged patients, against both expert consensus and diagnosis codes. Of 172,662 encounters in the derivation sample, 15.6% had a discharge diagnosis of pneumonia in a primary or secondary position. The median patient age in the derivation sample was 3.7 years (interquartile range, 1.4-9.5 years). In the validation sample, 185/1350 (13.8%) and 205/1350 (15.3%) were classified as pneumonia by content experts and by diagnosis codes, respectively. Compared to content experts, Naïve Bayes had the highest sensitivity (93.5%) and XGBoost had the highest F1 score (72.4). 
Compared to a diagnosis code of pneumonia, the highest sensitivity was again with the Naïve Bayes (80.1%), and the highest F1 score was with the support vector machine (53.0%). NLP algorithms can accurately identify pediatric pneumonia from radiography reports. Following external validation and implementation into the electronic health record, these algorithms can facilitate clinical decision support and inform large database research.</description><subject>chest radiograph</subject><subject>clinical decision support</subject><subject>Digital Health</subject><subject>machine learning</subject><subject>natural language processing</subject><subject>pediatric</subject><subject>pneumonia</subject><issn>2673-253X</issn><issn>2673-253X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNpVUk1v3CAQtaJWSZTmD_QQcexlt3wY21wqVVE_IkXqZQ-9oTGMbSLbOIBT7b8Pm91GyQWYYd4b3mOK4jOjWyEa9bWzfRq2nHKxZYyWFS3Pikte1WLDpfj74c35oriO8YFSyiXjnMrz4kJUilFZ08tivxuQWHzC0S8Tzon4jgCZfU6QGdIaYCQjzP0KPZIleIMxurknyfsxL8TZDHLdnixoHaTgDDEDxkQCWOf7AMtAAi4-pEj-uTSQZcZ18rODT8XHDsaI16f9qtj9_LG7_b25__Pr7vb7_caUVZM2slGSA0qpFGONAlEeNPO2saq0AjivjerAtI2QpiolGlq3MmtrcsipEVfF3ZHWenjQS3AThL324PRLwodeQ0jOjKhlU9UMFZdWyRJyY2E7RYVs204KblTm-nbkWtZ2Qmuy9OzPO9L3N7MbdO-ftFIqO08zwZcTQfCPa7ZJTy4aHLPD6Neoed1UkolS8FzKj6Um-BgDdq9tGNWHCdAvE6APbujTBGTQzdsHvkL-_7d4Bnl1r3s</recordid><startdate>20230222</startdate><enddate>20230222</enddate><creator>Rixe, Nancy</creator><creator>Frisch, Adam</creator><creator>Wang, Zhendong</creator><creator>Martin, Judith M</creator><creator>Suresh, Srinivasan</creator><creator>Florin, Todd A</creator><creator>Ramgopal, Sriram</creator><general>Frontiers Media S.A</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20230222</creationdate><title>The development of a novel natural language 
processing tool to identify pediatric chest radiograph reports with pneumonia</title><author>Rixe, Nancy ; Frisch, Adam ; Wang, Zhendong ; Martin, Judith M ; Suresh, Srinivasan ; Florin, Todd A ; Ramgopal, Sriram</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c468t-58952ae55991189a3420232b8d94d3a227c9facb835c645ec07b505785c620c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>chest radiograph</topic><topic>clinical decision support</topic><topic>Digital Health</topic><topic>machine learning</topic><topic>natural language processing</topic><topic>pediatric</topic><topic>pneumonia</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Rixe, Nancy</creatorcontrib><creatorcontrib>Frisch, Adam</creatorcontrib><creatorcontrib>Wang, Zhendong</creatorcontrib><creatorcontrib>Martin, Judith M</creatorcontrib><creatorcontrib>Suresh, Srinivasan</creatorcontrib><creatorcontrib>Florin, Todd A</creatorcontrib><creatorcontrib>Ramgopal, Sriram</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>Directory of Open Access Journals</collection><jtitle>Frontiers in digital health</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Rixe, Nancy</au><au>Frisch, Adam</au><au>Wang, Zhendong</au><au>Martin, Judith M</au><au>Suresh, Srinivasan</au><au>Florin, Todd A</au><au>Ramgopal, Sriram</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The development of a novel natural language processing tool to identify pediatric chest radiograph reports with pneumonia</atitle><jtitle>Frontiers in digital health</jtitle><addtitle>Front Digit 
Health</addtitle><date>2023-02-22</date><risdate>2023</risdate><volume>5</volume><spage>1104604</spage><epage>1104604</epage><pages>1104604-1104604</pages><issn>2673-253X</issn><eissn>2673-253X</eissn><abstract>Chest radiographs are frequently used to diagnose community-acquired pneumonia (CAP) for children in the acute care setting. Natural language processing (NLP)-based tools may be incorporated into the electronic health record and combined with other clinical data to develop meaningful clinical decision support tools for this common pediatric infection. We sought to develop and internally validate NLP algorithms to identify pediatric chest radiograph (CXR) reports with pneumonia. We performed a retrospective study of encounters for patients from six pediatric hospitals over a 3-year period. We utilized six NLP techniques: word embedding, support vector machines, extreme gradient boosting (XGBoost), light gradient boosting machines, Naïve Bayes, and logistic regression. We evaluated the performance of each model on a validation sample of 1,350 chest radiographs, developed as a stratified random sample of 35% admitted and 65% discharged patients, against both expert consensus and diagnosis codes. Of 172,662 encounters in the derivation sample, 15.6% had a discharge diagnosis of pneumonia in a primary or secondary position. The median patient age in the derivation sample was 3.7 years (interquartile range, 1.4-9.5 years). In the validation sample, 185/1350 (13.8%) and 205/1350 (15.3%) were classified as pneumonia by content experts and by diagnosis codes, respectively. Compared to content experts, Naïve Bayes had the highest sensitivity (93.5%) and XGBoost had the highest F1 score (72.4). Compared to a diagnosis code of pneumonia, the highest sensitivity was again with the Naïve Bayes (80.1%), and the highest F1 score was with the support vector machine (53.0%). NLP algorithms can accurately identify pediatric pneumonia from radiography reports. 
Following external validation and implementation into the electronic health record, these algorithms can facilitate clinical decision support and inform large database research.</abstract><cop>Switzerland</cop><pub>Frontiers Media S.A</pub><pmid>36910570</pmid><doi>10.3389/fdgth.2023.1104604</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2673-253X
ispartof Frontiers in digital health, 2023-02, Vol.5, p.1104604-1104604
issn 2673-253X
2673-253X
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_58671e925d954a5893df9035bbf532c9
source PubMed (Medline)
subjects chest radiograph
clinical decision support
Digital Health
machine learning
natural language processing
pediatric
pneumonia
title The development of a novel natural language processing tool to identify pediatric chest radiograph reports with pneumonia
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T05%3A56%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20development%20of%20a%20novel%20natural%20language%20processing%20tool%20to%20identify%20pediatric%20chest%20radiograph%20reports%20with%20pneumonia&rft.jtitle=Frontiers%20in%20digital%20health&rft.au=Rixe,%20Nancy&rft.date=2023-02-22&rft.volume=5&rft.spage=1104604&rft.epage=1104604&rft.pages=1104604-1104604&rft.issn=2673-253X&rft.eissn=2673-253X&rft_id=info:doi/10.3389/fdgth.2023.1104604&rft_dat=%3Cproquest_doaj_%3E2786513432%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c468t-58952ae55991189a3420232b8d94d3a227c9facb835c645ec07b505785c620c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2786513432&rft_id=info:pmid/36910570&rfr_iscdi=true