
A general text mining method to extract echocardiography measurement results from echocardiography documents

In everyday medical practice, the results of cardiac ultrasound examinations are generally recorded in unstructured text, from which extracting relevant information is an important and challenging task. This paper presents a generally applicable language and corpus-independent text mining method for...


Bibliographic Details
Published in: Artificial intelligence in medicine 2023-09, Vol.143, p.102584-102584, Article 102584
Main Authors: Szekér, Szabolcs; Fogarassy, György; Vathy-Fogarassy, Ágnes
Format: Article
Language: English
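Subjects: Clinical text mining; Echocardiography report; Information extraction; Named entity recognition; Natural language processing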
cited_by cdi_FETCH-LOGICAL-c339t-dfd4824789dfa4011de71379bb0ebd27884917927dc9f05c4870bf2db1ed38d73
cites cdi_FETCH-LOGICAL-c339t-dfd4824789dfa4011de71379bb0ebd27884917927dc9f05c4870bf2db1ed38d73
container_end_page 102584
container_issue
container_start_page 102584
container_title Artificial intelligence in medicine
container_volume 143
creator Szekér, Szabolcs
Fogarassy, György
Vathy-Fogarassy, Ágnes
description In everyday medical practice, the results of cardiac ultrasound examinations are generally recorded in unstructured text, from which extracting relevant information is an important and challenging task. This paper presents a generally applicable, language- and corpus-independent text mining method for extracting and structuring numerical measurement results and their descriptions from echocardiography reports. The developed method is based on generally applicable text mining preprocessing activities; it automatically identifies and standardizes the descriptions of the cardiac ultrasound measures, and it stores the extracted and standardized measurement descriptions with their measurement results in a structured form for later use. The method does not contain any regular expression-based search and does not rely on information about the structure of the document. The method has been tested on a document set containing more than 20,000 echocardiographic reports by examining the efficiency of extracting 12 echocardiography parameters considered important by experts. The method extracted and structured the echocardiography parameters under study with good sensitivity (lowest value: 0.775, highest value: 1.0, average: 0.904) and excellent specificity (1.0 in all cases). The F1 score ranged between 0.873 and 1.0, and its average value was 0.948. The presented case study has shown that the proposed method can extract measurement results from echocardiography documents with high confidence without performing a direct search or having detailed information about the data recording habits. Furthermore, it effectively handles spelling errors, abbreviations and the highly varied terminology used in descriptions. As it does not rely on any information related to the structure or the language of the documents or data recording habits, it can be applied to processing any free-text medical documents.
• A novel method for extracting measurement results from echocardiography reports
• The method does not require any a priori knowledge about the structure of the reports
• Measurement names and results are automatically identified, validated and extracted
• The method was evaluated on a corpus containing more than 20,000 reports
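The reported F1 range can be related to the reported sensitivity range with a short calculation. The sketch below is illustrative only and is not taken from the paper: the function name `f1_from_sensitivity` is hypothetical, and the default precision of 1.0 is an assumption inferred from the reported specificity of 1.0 (no false positives), since the abstract does not state precision directly.

```python
def f1_from_sensitivity(sensitivity: float, precision: float = 1.0) -> float:
    """Return F1, the harmonic mean of precision and sensitivity (recall).

    precision defaults to 1.0, which is an assumption: a specificity of 1.0
    means there were no false positives, so every extracted value is correct.
    """
    if precision + sensitivity == 0:
        return 0.0  # avoid division by zero when nothing was extracted
    return 2 * precision * sensitivity / (precision + sensitivity)


# Lowest and highest sensitivities quoted in the abstract.
lowest_f1 = f1_from_sensitivity(0.775)
highest_f1 = f1_from_sensitivity(1.0)

print(round(lowest_f1, 3))   # 0.873
print(round(highest_f1, 3))  # 1.0
```

Under that assumption the endpoints agree with the reported F1 range of 0.873 to 1.0. The reported average F1 (0.948) is an average over the 12 parameters, so it need not equal the F1 computed from the average sensitivity.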
doi_str_mv 10.1016/j.artmed.2023.102584
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2862199373</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0933365723000982</els_id><sourcerecordid>2862199373</sourcerecordid><originalsourceid>FETCH-LOGICAL-c339t-dfd4824789dfa4011de71379bb0ebd27884917927dc9f05c4870bf2db1ed38d73</originalsourceid><addsrcrecordid>eNp9kEtLAzEUhYMoWKv_wEWWbqbmMZ0kG6EUX1Bwo-uQSe60KTOTmmTE_nunjDvB1YXDdw7cD6FbShaU0Op-vzAxd-AWjDA-RmwpyzM0o1LwgsmKnKMZUZwXvFqKS3SV0p4QIkpazVC7wlvoIZoWZ_jOuPO977e4g7wLDueAxzAamzHYXbAmOh-20Rx2xxExaYjQQZ9xhDS0OeEmhu4v6YIdTli6RheNaRPc_N45-nh6fF-_FJu359f1alNYzlUuXONKyUohlWtMSSh1ICgXqq4J1I4JKUtFhWLCWdWQpS2lIHXDXE3BcekEn6O7afcQw-cAKevOJwtta3oIQ9KjE0aV4oKPaDmhNoaUIjT6EH1n4lFTok9y9V5PcvVJrp7kjrWHqQbjG18eok7WQ2_B-Qg2axf8_wM_wjKHaQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2862199373</pqid></control><display><type>article</type><title>A general text mining method to extract echocardiography measurement results from echocardiography documents</title><source>Elsevier</source><creator>Szekér, Szabolcs ; Fogarassy, György ; Vathy-Fogarassy, Ágnes</creator><creatorcontrib>Szekér, Szabolcs ; Fogarassy, György ; Vathy-Fogarassy, Ágnes</creatorcontrib><description>In everyday medical practice, the results of cardiac ultrasound examinations are generally recorded in unstructured text, from which extracting relevant information is an important and challenging task. This paper presents a generally applicable language and corpus-independent text mining method for extracting and structuring numerical measurement results and their descriptions from echocardiography reports. 
The developed method is based on generally applicable text mining preprocessing activities, it automatically identifies and standardizes the descriptions of the cardiac ultrasound measures, and it stores the extracted and standardized measurement descriptions with their measurement results in a structured form for later usage. The method does not contain any regular expression-based search and does not rely on information about the structure of the document. The method has been tested on a document set containing more than 20,000 echocardiographic reports by examining the efficiency of extracting 12 echocardiography parameters considered important by experts. The method extracted and structured the echocardiography parameters under the study with good sensitivity (lowest value: 0.775, highest value: 1.0, average: 0.904) and excellent specificity (for all cases 1.0). The F1 score ranged between 0.873 and 1.0, and its average value was 0.948. The presented case study has shown that the proposed method can extract measurement results from echocardiography documents with high confidence without performing a direct search or having detailed information about the data recording habits. Furthermore, it effectively handles spelling errors, abbreviations and the highly varied terminology used in descriptions. As it does not rely on any information related to the structure or the language of the documents or data recording habits, it can be applied for processing any free-text written medical texts. 
[Display omitted] •A novel method for extracting measurement results from echocardiography reports•The method does not require any a priori knowledge about the structure of the reports•Measurement names and results are automatically identified, validated and extracted•The method was evaluated on a corpus containing more than 20,000 reports</description><identifier>ISSN: 0933-3657</identifier><identifier>EISSN: 1873-2860</identifier><identifier>DOI: 10.1016/j.artmed.2023.102584</identifier><language>eng</language><publisher>Elsevier B.V</publisher><subject>Clinical text mining ; Echocardiography report ; Information extraction ; Named entity recognition ; Natural language processing</subject><ispartof>Artificial intelligence in medicine, 2023-09, Vol.143, p.102584-102584, Article 102584</ispartof><rights>2023 Elsevier B.V.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c339t-dfd4824789dfa4011de71379bb0ebd27884917927dc9f05c4870bf2db1ed38d73</citedby><cites>FETCH-LOGICAL-c339t-dfd4824789dfa4011de71379bb0ebd27884917927dc9f05c4870bf2db1ed38d73</cites><orcidid>0000-0002-5524-1675 ; 0000-0002-3698-9303</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Szekér, Szabolcs</creatorcontrib><creatorcontrib>Fogarassy, György</creatorcontrib><creatorcontrib>Vathy-Fogarassy, Ágnes</creatorcontrib><title>A general text mining method to extract echocardiography measurement results from echocardiography documents</title><title>Artificial intelligence in medicine</title><description>In everyday medical practice, the results of cardiac ultrasound examinations are generally recorded in unstructured text, from which extracting relevant information is an important and challenging task. 
This paper presents a generally applicable language and corpus-independent text mining method for extracting and structuring numerical measurement results and their descriptions from echocardiography reports. The developed method is based on generally applicable text mining preprocessing activities, it automatically identifies and standardizes the descriptions of the cardiac ultrasound measures, and it stores the extracted and standardized measurement descriptions with their measurement results in a structured form for later usage. The method does not contain any regular expression-based search and does not rely on information about the structure of the document. The method has been tested on a document set containing more than 20,000 echocardiographic reports by examining the efficiency of extracting 12 echocardiography parameters considered important by experts. The method extracted and structured the echocardiography parameters under the study with good sensitivity (lowest value: 0.775, highest value: 1.0, average: 0.904) and excellent specificity (for all cases 1.0). The F1 score ranged between 0.873 and 1.0, and its average value was 0.948. The presented case study has shown that the proposed method can extract measurement results from echocardiography documents with high confidence without performing a direct search or having detailed information about the data recording habits. Furthermore, it effectively handles spelling errors, abbreviations and the highly varied terminology used in descriptions. As it does not rely on any information related to the structure or the language of the documents or data recording habits, it can be applied for processing any free-text written medical texts. 
[Display omitted] •A novel method for extracting measurement results from echocardiography reports•The method does not require any a priori knowledge about the structure of the reports•Measurement names and results are automatically identified, validated and extracted•The method was evaluated on a corpus containing more than 20,000 reports</description><subject>Clinical text mining</subject><subject>Echocardiography report</subject><subject>Information extraction</subject><subject>Named entity recognition</subject><subject>Natural language processing</subject><issn>0933-3657</issn><issn>1873-2860</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9kEtLAzEUhYMoWKv_wEWWbqbmMZ0kG6EUX1Bwo-uQSe60KTOTmmTE_nunjDvB1YXDdw7cD6FbShaU0Op-vzAxd-AWjDA-RmwpyzM0o1LwgsmKnKMZUZwXvFqKS3SV0p4QIkpazVC7wlvoIZoWZ_jOuPO977e4g7wLDueAxzAamzHYXbAmOh-20Rx2xxExaYjQQZ9xhDS0OeEmhu4v6YIdTli6RheNaRPc_N45-nh6fF-_FJu359f1alNYzlUuXONKyUohlWtMSSh1ICgXqq4J1I4JKUtFhWLCWdWQpS2lIHXDXE3BcekEn6O7afcQw-cAKevOJwtta3oIQ9KjE0aV4oKPaDmhNoaUIjT6EH1n4lFTok9y9V5PcvVJrp7kjrWHqQbjG18eok7WQ2_B-Qg2axf8_wM_wjKHaQ</recordid><startdate>202309</startdate><enddate>202309</enddate><creator>Szekér, Szabolcs</creator><creator>Fogarassy, György</creator><creator>Vathy-Fogarassy, Ágnes</creator><general>Elsevier B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-5524-1675</orcidid><orcidid>https://orcid.org/0000-0002-3698-9303</orcidid></search><sort><creationdate>202309</creationdate><title>A general text mining method to extract echocardiography measurement results from echocardiography documents</title><author>Szekér, Szabolcs ; Fogarassy, György ; Vathy-Fogarassy, 
Ágnes</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c339t-dfd4824789dfa4011de71379bb0ebd27884917927dc9f05c4870bf2db1ed38d73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Clinical text mining</topic><topic>Echocardiography report</topic><topic>Information extraction</topic><topic>Named entity recognition</topic><topic>Natural language processing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Szekér, Szabolcs</creatorcontrib><creatorcontrib>Fogarassy, György</creatorcontrib><creatorcontrib>Vathy-Fogarassy, Ágnes</creatorcontrib><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Artificial intelligence in medicine</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Szekér, Szabolcs</au><au>Fogarassy, György</au><au>Vathy-Fogarassy, Ágnes</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A general text mining method to extract echocardiography measurement results from echocardiography documents</atitle><jtitle>Artificial intelligence in medicine</jtitle><date>2023-09</date><risdate>2023</risdate><volume>143</volume><spage>102584</spage><epage>102584</epage><pages>102584-102584</pages><artnum>102584</artnum><issn>0933-3657</issn><eissn>1873-2860</eissn><abstract>In everyday medical practice, the results of cardiac ultrasound examinations are generally recorded in unstructured text, from which extracting relevant information is an important and challenging task. This paper presents a generally applicable language and corpus-independent text mining method for extracting and structuring numerical measurement results and their descriptions from echocardiography reports. 
The developed method is based on generally applicable text mining preprocessing activities, it automatically identifies and standardizes the descriptions of the cardiac ultrasound measures, and it stores the extracted and standardized measurement descriptions with their measurement results in a structured form for later usage. The method does not contain any regular expression-based search and does not rely on information about the structure of the document. The method has been tested on a document set containing more than 20,000 echocardiographic reports by examining the efficiency of extracting 12 echocardiography parameters considered important by experts. The method extracted and structured the echocardiography parameters under the study with good sensitivity (lowest value: 0.775, highest value: 1.0, average: 0.904) and excellent specificity (for all cases 1.0). The F1 score ranged between 0.873 and 1.0, and its average value was 0.948. The presented case study has shown that the proposed method can extract measurement results from echocardiography documents with high confidence without performing a direct search or having detailed information about the data recording habits. Furthermore, it effectively handles spelling errors, abbreviations and the highly varied terminology used in descriptions. As it does not rely on any information related to the structure or the language of the documents or data recording habits, it can be applied for processing any free-text written medical texts. 
[Display omitted] •A novel method for extracting measurement results from echocardiography reports•The method does not require any a priori knowledge about the structure of the reports•Measurement names and results are automatically identified, validated and extracted•The method was evaluated on a corpus containing more than 20,000 reports</abstract><pub>Elsevier B.V</pub><doi>10.1016/j.artmed.2023.102584</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0002-5524-1675</orcidid><orcidid>https://orcid.org/0000-0002-3698-9303</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0933-3657
ispartof Artificial intelligence in medicine, 2023-09, Vol.143, p.102584-102584, Article 102584
issn 0933-3657
1873-2860
language eng
recordid cdi_proquest_miscellaneous_2862199373
source Elsevier
subjects Clinical text mining
Echocardiography report
Information extraction
Named entity recognition
Natural language processing
title A general text mining method to extract echocardiography measurement results from echocardiography documents
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T15%3A42%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20general%20text%20mining%20method%20to%20extract%20echocardiography%20measurement%20results%20from%20echocardiography%20documents&rft.jtitle=Artificial%20intelligence%20in%20medicine&rft.au=Szek%C3%A9r,%20Szabolcs&rft.date=2023-09&rft.volume=143&rft.spage=102584&rft.epage=102584&rft.pages=102584-102584&rft.artnum=102584&rft.issn=0933-3657&rft.eissn=1873-2860&rft_id=info:doi/10.1016/j.artmed.2023.102584&rft_dat=%3Cproquest_cross%3E2862199373%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c339t-dfd4824789dfa4011de71379bb0ebd27884917927dc9f05c4870bf2db1ed38d73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2862199373&rft_id=info:pmid/&rfr_iscdi=true