Post-Structuring Radiology Reports of Breast Cancer Patients for Clinical Quality Assurance

Hospitals often set protocols based on well-defined standards to maintain the quality of patient reports. To ensure that clinicians conform to the protocols, quality assurance of these reports is needed. Patient reports are currently written in free-text format, which complicates the task of qua...

Bibliographic Details
Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2020-11, Vol. 17 (6), p. 1883-1894
Main Authors: Pathak, Shreyasi, van Rossen, Jorit, Vijlbrief, Onno, Geerdink, Jeroen, Seifert, Christin, van Keulen, Maurice
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c392t-d8ce2eeb8cba215d977949a8ee47c94d5c701609d593de91a2b93800acdc1813
cites cdi_FETCH-LOGICAL-c392t-d8ce2eeb8cba215d977949a8ee47c94d5c701609d593de91a2b93800acdc1813
container_end_page 1894
container_issue 6
container_start_page 1883
container_title IEEE/ACM transactions on computational biology and bioinformatics
container_volume 17
creator Pathak, Shreyasi
van Rossen, Jorit
Vijlbrief, Onno
Geerdink, Jeroen
Seifert, Christin
van Keulen, Maurice
description Hospitals often set protocols based on well-defined standards to maintain the quality of patient reports. To ensure that clinicians conform to the protocols, quality assurance of these reports is needed. Patient reports are currently written in free-text format, which complicates the task of quality assurance. In this paper, we present a machine-learning-based natural language processing system for automatic quality assurance of radiology reports on breast cancer. This is achieved in three steps: we i) identify the top-level structure (headings) of the report, ii) classify the report content into the top-level headings, and iii) convert the free-text detailed findings in the report to a semi-structured format (post-structuring). The top-level structure and the content of the report were predicted with F1 scores of 0.97 and 0.94, respectively, using Support Vector Machine (SVM) classifiers. For automatic structuring, our proposed hierarchical Conditional Random Field (CRF) outperformed the baseline CRF with an F1 score of 0.78 versus 0.71. The determined structure is represented as a semi-structured XML version of the free-text report, which makes it easy to visualize the conformance of the findings to the protocols. This format also allows easy extraction of specific information for other purposes such as search, evaluation, and research.
doi_str_mv 10.1109/TCBB.2019.2914678
format article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_proquest_miscellaneous_2232133779</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8705380</ieee_id><sourcerecordid>2468772008</sourcerecordid><originalsourceid>FETCH-LOGICAL-c392t-d8ce2eeb8cba215d977949a8ee47c94d5c701609d593de91a2b93800acdc1813</originalsourceid><addsrcrecordid>eNpdkE2LFDEQhoMo7of-ABEk4MVLj_nspI47ja7Cgus6Nw8hk65ZsvR0xiR9mH9vNzPuwVMV1FMvVQ8h7zhbcc7g86Zbr1eCcVgJ4Ko19gW55FqbBqBVL5de6UZDKy_IVSlPjAkFTL0mF5IzDUrLS_L7PpXa_Kp5CnXKcXykD76PaUiPR_qAh5RroWlH1xl9qbTzY8BM732NOM6TXcq0G-IYgx_oz8kPsR7pTSlTXsA35NXODwXfnus12Xz9sum-NXc_br93N3dNkCBq09uAAnFrw9YLrnswBhR4i6hMANXrYBhvGfQaZI_AvdiCtIz50Aduubwmn06xh5z-TFiq28cScBj8iGkqTggpuJRz6ox-_A99SlMe5-OcUK01RjBmZ4qfqJBTKRl37pDj3uej48wt4t0i3i3i3Vn8vPPhnDxt99g_b_wzPQPvT0BExOexNUzPv8i_fVqG7w</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2468772008</pqid></control><display><type>article</type><title>Post-Structuring Radiology Reports of Breast Cancer Patients for Clinical Quality Assurance</title><source>IEEE Electronic Library (IEL) Journals</source><source>Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)</source><creator>Pathak, Shreyasi ; van Rossen, Jorit ; Vijlbrief, Onno ; Geerdink, Jeroen ; Seifert, Christin ; van Keulen, Maurice</creator><creatorcontrib>Pathak, Shreyasi ; van Rossen, Jorit ; Vijlbrief, Onno ; Geerdink, Jeroen ; Seifert, Christin ; van Keulen, Maurice</creatorcontrib><description>Hospitals often set protocols based on well defined standards to maintain the quality of patient reports. To ensure that the clinicians conform to the protocols, quality assurance of these reports is needed. Patient reports are currently written in free-text format, which complicates the task of quality assurance. 
In this paper, we present a machine learning based natural language processing system for automatic quality assurance of radiology reports on breast cancer. This is achieved in three steps: we i) identify the top-level structure (headings) of the report, ii) classify the report content into the top-level headings, and iii) convert the free-text detailed findings in the report to a semi-structured format (post-structuring). Top level structure and content of report were predicted with an F1 score of 0.97 and 0.94, respectively, using Support Vector Machine (SVM) classifiers. For automatic structuring, our proposed hierarchical Conditional Random Field (CRF) outperformed the baseline CRF with an F1 score of 0.78 versus 0.71. The determined structure of the report is represented in semi-structured XML format of the free-text report, which helps to easily visualize the conformance of the findings to the protocols. This format also allows easy extraction of specific information for other purposes such as search, evaluation, and research.</description><identifier>ISSN: 1545-5963</identifier><identifier>EISSN: 1557-9964</identifier><identifier>DOI: 10.1109/TCBB.2019.2914678</identifier><identifier>PMID: 31059453</identifier><identifier>CODEN: ITCBCY</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>automatic structuring ; Breast cancer ; conditional random field ; Conditional random fields ; Format ; Learning algorithms ; Machine learning ; Natural language processing ; Patients ; post-structuring ; Quality assurance ; Quality control ; Radiology ; radiology reports ; Support vector machines</subject><ispartof>IEEE/ACM transactions on computational biology and bioinformatics, 2020-11, Vol.17 (6), p.1883-1894</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c392t-d8ce2eeb8cba215d977949a8ee47c94d5c701609d593de91a2b93800acdc1813</citedby><cites>FETCH-LOGICAL-c392t-d8ce2eeb8cba215d977949a8ee47c94d5c701609d593de91a2b93800acdc1813</cites><orcidid>0000-0001-6718-6653 ; 0000-0003-2303-6009 ; 0000-0002-6776-3868 ; 0000-0002-6984-8208</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8705380$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31059453$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Pathak, Shreyasi</creatorcontrib><creatorcontrib>van Rossen, Jorit</creatorcontrib><creatorcontrib>Vijlbrief, Onno</creatorcontrib><creatorcontrib>Geerdink, Jeroen</creatorcontrib><creatorcontrib>Seifert, Christin</creatorcontrib><creatorcontrib>van Keulen, Maurice</creatorcontrib><title>Post-Structuring Radiology Reports of Breast Cancer Patients for Clinical Quality Assurance</title><title>IEEE/ACM transactions on computational biology and bioinformatics</title><addtitle>TCBB</addtitle><addtitle>IEEE/ACM Trans Comput Biol Bioinform</addtitle><description>Hospitals often set protocols based on well defined standards to maintain the quality of patient reports. To ensure that the clinicians conform to the protocols, quality assurance of these reports is needed. Patient reports are currently written in free-text format, which complicates the task of quality assurance. In this paper, we present a machine learning based natural language processing system for automatic quality assurance of radiology reports on breast cancer. 
This is achieved in three steps: we i) identify the top-level structure (headings) of the report, ii) classify the report content into the top-level headings, and iii) convert the free-text detailed findings in the report to a semi-structured format (post-structuring). Top level structure and content of report were predicted with an F1 score of 0.97 and 0.94, respectively, using Support Vector Machine (SVM) classifiers. For automatic structuring, our proposed hierarchical Conditional Random Field (CRF) outperformed the baseline CRF with an F1 score of 0.78 versus 0.71. The determined structure of the report is represented in semi-structured XML format of the free-text report, which helps to easily visualize the conformance of the findings to the protocols. This format also allows easy extraction of specific information for other purposes such as search, evaluation, and research.</description><subject>automatic structuring</subject><subject>Breast cancer</subject><subject>conditional random field</subject><subject>Conditional random fields</subject><subject>Format</subject><subject>Learning algorithms</subject><subject>Machine learning</subject><subject>Natural language processing</subject><subject>Patients</subject><subject>post-structuring</subject><subject>Quality assurance</subject><subject>Quality control</subject><subject>Radiology</subject><subject>radiology reports</subject><subject>Support vector 
machines</subject><issn>1545-5963</issn><issn>1557-9964</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNpdkE2LFDEQhoMo7of-ABEk4MVLj_nspI47ja7Cgus6Nw8hk65ZsvR0xiR9mH9vNzPuwVMV1FMvVQ8h7zhbcc7g86Zbr1eCcVgJ4Ko19gW55FqbBqBVL5de6UZDKy_IVSlPjAkFTL0mF5IzDUrLS_L7PpXa_Kp5CnXKcXykD76PaUiPR_qAh5RroWlH1xl9qbTzY8BM732NOM6TXcq0G-IYgx_oz8kPsR7pTSlTXsA35NXODwXfnus12Xz9sum-NXc_br93N3dNkCBq09uAAnFrw9YLrnswBhR4i6hMANXrYBhvGfQaZI_AvdiCtIz50Aduubwmn06xh5z-TFiq28cScBj8iGkqTggpuJRz6ox-_A99SlMe5-OcUK01RjBmZ4qfqJBTKRl37pDj3uej48wt4t0i3i3i3Vn8vPPhnDxt99g_b_wzPQPvT0BExOexNUzPv8i_fVqG7w</recordid><startdate>202011</startdate><enddate>202011</enddate><creator>Pathak, Shreyasi</creator><creator>van Rossen, Jorit</creator><creator>Vijlbrief, Onno</creator><creator>Geerdink, Jeroen</creator><creator>Seifert, Christin</creator><creator>van Keulen, Maurice</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>JG9</scope><scope>JQ2</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0001-6718-6653</orcidid><orcidid>https://orcid.org/0000-0003-2303-6009</orcidid><orcidid>https://orcid.org/0000-0002-6776-3868</orcidid><orcidid>https://orcid.org/0000-0002-6984-8208</orcidid></search><sort><creationdate>202011</creationdate><title>Post-Structuring Radiology Reports of Breast Cancer Patients for Clinical Quality Assurance</title><author>Pathak, Shreyasi ; van Rossen, Jorit ; Vijlbrief, Onno ; Geerdink, Jeroen ; Seifert, Christin ; van Keulen, Maurice</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c392t-d8ce2eeb8cba215d977949a8ee47c94d5c701609d593de91a2b93800acdc1813</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>automatic structuring</topic><topic>Breast cancer</topic><topic>conditional random field</topic><topic>Conditional random fields</topic><topic>Format</topic><topic>Learning algorithms</topic><topic>Machine learning</topic><topic>Natural language processing</topic><topic>Patients</topic><topic>post-structuring</topic><topic>Quality assurance</topic><topic>Quality control</topic><topic>Radiology</topic><topic>radiology reports</topic><topic>Support vector machines</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Pathak, Shreyasi</creatorcontrib><creatorcontrib>van Rossen, 
Jorit</creatorcontrib><creatorcontrib>Vijlbrief, Onno</creatorcontrib><creatorcontrib>Geerdink, Jeroen</creatorcontrib><creatorcontrib>Seifert, Christin</creatorcontrib><creatorcontrib>van Keulen, Maurice</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005–Present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE/ACM transactions on computational biology and bioinformatics</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Pathak, Shreyasi</au><au>van Rossen, Jorit</au><au>Vijlbrief, Onno</au><au>Geerdink, Jeroen</au><au>Seifert, Christin</au><au>van Keulen, Maurice</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Post-Structuring Radiology Reports of Breast Cancer Patients for Clinical Quality Assurance</atitle><jtitle>IEEE/ACM transactions on computational biology and bioinformatics</jtitle><stitle>TCBB</stitle><addtitle>IEEE/ACM Trans Comput Biol Bioinform</addtitle><date>2020-11</date><risdate>2020</risdate><volume>17</volume><issue>6</issue><spage>1883</spage><epage>1894</epage><pages>1883-1894</pages><issn>1545-5963</issn><eissn>1557-9964</eissn><coden>ITCBCY</coden><abstract>Hospitals often set protocols based on well defined standards to maintain the quality of patient reports. To ensure that the clinicians conform to the protocols, quality assurance of these reports is needed. Patient reports are currently written in free-text format, which complicates the task of quality assurance. In this paper, we present a machine learning based natural language processing system for automatic quality assurance of radiology reports on breast cancer. This is achieved in three steps: we i) identify the top-level structure (headings) of the report, ii) classify the report content into the top-level headings, and iii) convert the free-text detailed findings in the report to a semi-structured format (post-structuring). Top level structure and content of report were predicted with an F1 score of 0.97 and 0.94, respectively, using Support Vector Machine (SVM) classifiers. For automatic structuring, our proposed hierarchical Conditional Random Field (CRF) outperformed the baseline CRF with an F1 score of 0.78 versus 0.71. 
The determined structure of the report is represented in semi-structured XML format of the free-text report, which helps to easily visualize the conformance of the findings to the protocols. This format also allows easy extraction of specific information for other purposes such as search, evaluation, and research.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>31059453</pmid><doi>10.1109/TCBB.2019.2914678</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0001-6718-6653</orcidid><orcidid>https://orcid.org/0000-0003-2303-6009</orcidid><orcidid>https://orcid.org/0000-0002-6776-3868</orcidid><orcidid>https://orcid.org/0000-0002-6984-8208</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1545-5963
ispartof IEEE/ACM transactions on computational biology and bioinformatics, 2020-11, Vol.17 (6), p.1883-1894
issn 1545-5963
1557-9964
language eng
recordid cdi_proquest_miscellaneous_2232133779
source IEEE Electronic Library (IEL) Journals; Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)
subjects automatic structuring
Breast cancer
conditional random field
Conditional random fields
Format
Learning algorithms
Machine learning
Natural language processing
Patients
post-structuring
Quality assurance
Quality control
Radiology
radiology reports
Support vector machines
title Post-Structuring Radiology Reports of Breast Cancer Patients for Clinical Quality Assurance
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T21%3A16%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Post-Structuring%20Radiology%20Reports%20of%20Breast%20Cancer%20Patients%20for%20Clinical%20Quality%20Assurance&rft.jtitle=IEEE/ACM%20transactions%20on%20computational%20biology%20and%20bioinformatics&rft.au=Pathak,%20Shreyasi&rft.date=2020-11&rft.volume=17&rft.issue=6&rft.spage=1883&rft.epage=1894&rft.pages=1883-1894&rft.issn=1545-5963&rft.eissn=1557-9964&rft.coden=ITCBCY&rft_id=info:doi/10.1109/TCBB.2019.2914678&rft_dat=%3Cproquest_pubme%3E2468772008%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c392t-d8ce2eeb8cba215d977949a8ee47c94d5c701609d593de91a2b93800acdc1813%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2468772008&rft_id=info:pmid/31059453&rft_ieee_id=8705380&rfr_iscdi=true
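
The description in this record says that the structured report is represented in a semi-structured XML format. The sketch below illustrates that general idea only; it is not the authors' implementation. The element names (`report`, `sentence`), the heading labels, and the example sentences are all hypothetical and invented for illustration, and the sketch assumes the classification step has already assigned one top-level heading to each sentence.

```python
import xml.etree.ElementTree as ET


def to_semi_structured_xml(labeled_sentences):
    """Group (heading, sentence) pairs under one XML element per heading.

    `labeled_sentences` is assumed to be the output of an upstream
    classifier; the heading strings are used directly as tag names.
    """
    report = ET.Element("report")  # hypothetical root tag
    sections = {}
    for heading, sentence in labeled_sentences:
        # Create each section element once, in order of first appearance.
        if heading not in sections:
            sections[heading] = ET.SubElement(report, heading)
        ET.SubElement(sections[heading], "sentence").text = sentence
    return report


# Invented example input, not data from the paper.
labeled = [
    ("clinical_data", "Screening mammography."),
    ("findings", "Mass in the left breast."),
    ("findings", "No calcifications."),
    ("conclusion", "Further assessment advised."),
]

report = to_semi_structured_xml(labeled)
xml_text = ET.tostring(report, encoding="unicode")
print(xml_text)
```

Once the report is in this form, a conformance check or a search can address sections by name, for example `report.find("findings")`, instead of scanning free text.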