Personalized Impression Generation for PET Reports Using Large Language Models
Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether...
Published in: | Journal of digital imaging 2024-04, Vol.37 (2), p.471-488 |
---|---|
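Main Authors: | Tie, Xin ; Shin, Muheon ; Pirasteh, Ali ; Ibrahim, Nevein ; Huemann, Zachary ; Castellino, Sharon M ; Kelly, Kara M ; Garrett, John ; Hu, Junjie ; Cho, Steve Y ; Bradshaw, Tyler J |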
Format: | Article |
Language: | English |
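Subjects: | Algorithms ; Cancer ; Clinical trials ; Customization ; Datasets ; Informatics ; Language ; Large language models ; Lymphoma ; Nuclear medicine ; Patients ; Pediatrics ; Physicians ; Public health ; Radiology ; Reading |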
cited_by | cdi_FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3 |
---|---|
cites | cdi_FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3 |
container_end_page | 488 |
container_issue | 2 |
container_start_page | 471 |
container_title | Journal of digital imaging |
container_volume | 37 |
creator | Tie, Xin ; Shin, Muheon ; Pirasteh, Ali ; Ibrahim, Nevein ; Huemann, Zachary ; Castellino, Sharon M ; Kelly, Kara M ; Garrett, John ; Hu, Junjie ; Cho, Steve Y ; Bradshaw, Tyler J |
description | Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether LLMs can create clinically useful impressions for PET reporting. To this end, we fine-tuned twelve open-source language models on a corpus of 37,370 retrospective PET reports collected from our institution. All models were trained using the teacher-forcing algorithm, with the report findings and patient information as input and the original clinical impressions as reference. An extra input token encoded the reading physician's identity, allowing models to learn physician-specific reporting styles. To compare the performances of different models, we computed various automatic evaluation metrics and benchmarked them against physician preferences, ultimately selecting PEGASUS as the top LLM. To evaluate its clinical utility, three nuclear medicine physicians assessed the PEGASUS-generated impressions and original clinical impressions across 6 quality dimensions (3-point scales) and an overall utility score (5-point scale). Each physician reviewed 12 of their own reports and 12 reports from other physicians. When physicians assessed LLM impressions generated in their own style, 89% were considered clinically acceptable, with a mean utility score of 4.08/5. On average, physicians rated these personalized impressions as comparable in overall utility to the impressions dictated by other physicians (4.03, P = 0.41). In summary, our study demonstrated that personalized impressions generated by PEGASUS were clinically useful in most cases, highlighting its potential to expedite PET reporting by automatically drafting impressions. |
doi_str_mv | 10.1007/s10278-024-00985-3 |
format | article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_proquest_miscellaneous_2925485720</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3041683199</sourcerecordid><originalsourceid>FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3</originalsourceid><addsrcrecordid>eNpdUcFKJDEQDeKi4voDHqTBi5d2K6mkO30SkVEHZndl0XPIdFfGlp7OmHQL-vVmHBXdS9WDevWoV4-xQw6nHKD8FTmIUucgZA5QaZXjFtsTldS5qBC3v-BddhDjAwAgcsQCdtguagQNJeyxPzcUou9t175Qk02Xq0Axtr7PrqinYIc1dD5kN5Pb7B-tfBhidhfbfpHNbFhQqv1itAn89g118Sf74WwX6eC977O7y8ntxXU--3s1vTif5TUWfMilK0qFFWrSc1tbYefK2doph4WSUlqrOW-okTXnRbo6-VVKF7qs-NwV5GrcZ2cb3dU4X1JTUz8E25lVaJc2PBtvW_N90rf3ZuGfDOeAXIkyKZy8KwT_OFIczLKNNXWd7cmP0YhKKKlVKSBRj_-jPvgxpJ9FgyB5oZFXVWKJDasOPsZA7vMaDmZtwWwiMyky8xaZwbR09NXH58pHQPgKudGRRg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3041683199</pqid></control><display><type>article</type><title>Personalized Impression Generation for PET Reports Using Large Language Models</title><source>Springer Link</source><source>PubMed Central</source><creator>Tie, Xin ; Shin, Muheon ; Pirasteh, Ali ; Ibrahim, Nevein ; Huemann, Zachary ; Castellino, Sharon M ; Kelly, Kara M ; Garrett, John ; Hu, Junjie ; Cho, Steve Y ; Bradshaw, Tyler J</creator><creatorcontrib>Tie, Xin ; Shin, Muheon ; Pirasteh, Ali ; Ibrahim, Nevein ; Huemann, Zachary ; Castellino, Sharon M ; Kelly, Kara M ; Garrett, John ; Hu, Junjie ; Cho, Steve Y ; Bradshaw, Tyler J</creatorcontrib><description>Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether LLMs can create clinically useful impressions for PET reporting. 
To this end, we fine-tuned twelve open-source language models on a corpus of 37,370 retrospective PET reports collected from our institution. All models were trained using the teacher-forcing algorithm, with the report findings and patient information as input and the original clinical impressions as reference. An extra input token encoded the reading physician's identity, allowing models to learn physician-specific reporting styles. To compare the performances of different models, we computed various automatic evaluation metrics and benchmarked them against physician preferences, ultimately selecting PEGASUS as the top LLM. To evaluate its clinical utility, three nuclear medicine physicians assessed the PEGASUS-generated impressions and original clinical impressions across 6 quality dimensions (3-point scales) and an overall utility score (5-point scale). Each physician reviewed 12 of their own reports and 12 reports from other physicians. When physicians assessed LLM impressions generated in their own style, 89% were considered clinically acceptable, with a mean utility score of 4.08/5. On average, physicians rated these personalized impressions as comparable in overall utility to the impressions dictated by other physicians (4.03, P = 0.41). 
In summary, our study demonstrated that personalized impressions generated by PEGASUS were clinically useful in most cases, highlighting its potential to expedite PET reporting by automatically drafting impressions.</description><identifier>ISSN: 2948-2933</identifier><identifier>ISSN: 0897-1889</identifier><identifier>ISSN: 2948-2925</identifier><identifier>EISSN: 2948-2933</identifier><identifier>EISSN: 1618-727X</identifier><identifier>DOI: 10.1007/s10278-024-00985-3</identifier><identifier>PMID: 38308070</identifier><language>eng</language><publisher>Switzerland: Springer Nature B.V</publisher><subject>Algorithms ; Cancer ; Clinical trials ; Customization ; Datasets ; Informatics ; Language ; Large language models ; Lymphoma ; Nuclear medicine ; Patients ; Pediatrics ; Physicians ; Public health ; Radiology ; Reading</subject><ispartof>Journal of digital imaging, 2024-04, Vol.37 (2), p.471-488</ispartof><rights>2024. The Author(s) under exclusive licence to Society for Imaging Informatics in Medicine.</rights><rights>The Author(s) under exclusive licence to Society for Imaging Informatics in Medicine 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3</citedby><cites>FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3</cites><orcidid>0000-0001-9549-7002</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11031527/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11031527/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,723,776,780,881,27903,27904,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38308070$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Tie, Xin</creatorcontrib><creatorcontrib>Shin, Muheon</creatorcontrib><creatorcontrib>Pirasteh, Ali</creatorcontrib><creatorcontrib>Ibrahim, Nevein</creatorcontrib><creatorcontrib>Huemann, Zachary</creatorcontrib><creatorcontrib>Castellino, Sharon M</creatorcontrib><creatorcontrib>Kelly, Kara M</creatorcontrib><creatorcontrib>Garrett, John</creatorcontrib><creatorcontrib>Hu, Junjie</creatorcontrib><creatorcontrib>Cho, Steve Y</creatorcontrib><creatorcontrib>Bradshaw, Tyler J</creatorcontrib><title>Personalized Impression Generation for PET Reports Using Large Language Models</title><title>Journal of digital imaging</title><addtitle>J Imaging Inform Med</addtitle><description>Large language models (LLMs) have shown 
promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether LLMs can create clinically useful impressions for PET reporting. To this end, we fine-tuned twelve open-source language models on a corpus of 37,370 retrospective PET reports collected from our institution. All models were trained using the teacher-forcing algorithm, with the report findings and patient information as input and the original clinical impressions as reference. An extra input token encoded the reading physician's identity, allowing models to learn physician-specific reporting styles. To compare the performances of different models, we computed various automatic evaluation metrics and benchmarked them against physician preferences, ultimately selecting PEGASUS as the top LLM. To evaluate its clinical utility, three nuclear medicine physicians assessed the PEGASUS-generated impressions and original clinical impressions across 6 quality dimensions (3-point scales) and an overall utility score (5-point scale). Each physician reviewed 12 of their own reports and 12 reports from other physicians. When physicians assessed LLM impressions generated in their own style, 89% were considered clinically acceptable, with a mean utility score of 4.08/5. On average, physicians rated these personalized impressions as comparable in overall utility to the impressions dictated by other physicians (4.03, P = 0.41). 
In summary, our study demonstrated that personalized impressions generated by PEGASUS were clinically useful in most cases, highlighting its potential to expedite PET reporting by automatically drafting impressions.</description><subject>Algorithms</subject><subject>Cancer</subject><subject>Clinical trials</subject><subject>Customization</subject><subject>Datasets</subject><subject>Informatics</subject><subject>Language</subject><subject>Large language models</subject><subject>Lymphoma</subject><subject>Nuclear medicine</subject><subject>Patients</subject><subject>Pediatrics</subject><subject>Physicians</subject><subject>Public health</subject><subject>Radiology</subject><subject>Reading</subject><issn>2948-2933</issn><issn>0897-1889</issn><issn>2948-2925</issn><issn>2948-2933</issn><issn>1618-727X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNpdUcFKJDEQDeKi4voDHqTBi5d2K6mkO30SkVEHZndl0XPIdFfGlp7OmHQL-vVmHBXdS9WDevWoV4-xQw6nHKD8FTmIUucgZA5QaZXjFtsTldS5qBC3v-BddhDjAwAgcsQCdtguagQNJeyxPzcUou9t175Qk02Xq0Axtr7PrqinYIc1dD5kN5Pb7B-tfBhidhfbfpHNbFhQqv1itAn89g118Sf74WwX6eC977O7y8ntxXU--3s1vTif5TUWfMilK0qFFWrSc1tbYefK2doph4WSUlqrOW-okTXnRbo6-VVKF7qs-NwV5GrcZ2cb3dU4X1JTUz8E25lVaJc2PBtvW_N90rf3ZuGfDOeAXIkyKZy8KwT_OFIczLKNNXWd7cmP0YhKKKlVKSBRj_-jPvgxpJ9FgyB5oZFXVWKJDasOPsZA7vMaDmZtwWwiMyky8xaZwbR09NXH58pHQPgKudGRRg</recordid><startdate>20240401</startdate><enddate>20240401</enddate><creator>Tie, Xin</creator><creator>Shin, Muheon</creator><creator>Pirasteh, Ali</creator><creator>Ibrahim, Nevein</creator><creator>Huemann, Zachary</creator><creator>Castellino, Sharon M</creator><creator>Kelly, Kara M</creator><creator>Garrett, John</creator><creator>Hu, Junjie</creator><creator>Cho, Steve Y</creator><creator>Bradshaw, Tyler J</creator><general>Springer Nature B.V</general><general>Springer International 
Publishing</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7SC</scope><scope>7TK</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>K9.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>NAPCQ</scope><scope>P64</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-9549-7002</orcidid></search><sort><creationdate>20240401</creationdate><title>Personalized Impression Generation for PET Reports Using Large Language Models</title><author>Tie, Xin ; Shin, Muheon ; Pirasteh, Ali ; Ibrahim, Nevein ; Huemann, Zachary ; Castellino, Sharon M ; Kelly, Kara M ; Garrett, John ; Hu, Junjie ; Cho, Steve Y ; Bradshaw, Tyler J</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Cancer</topic><topic>Clinical trials</topic><topic>Customization</topic><topic>Datasets</topic><topic>Informatics</topic><topic>Language</topic><topic>Large language models</topic><topic>Lymphoma</topic><topic>Nuclear medicine</topic><topic>Patients</topic><topic>Pediatrics</topic><topic>Physicians</topic><topic>Public health</topic><topic>Radiology</topic><topic>Reading</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Tie, Xin</creatorcontrib><creatorcontrib>Shin, Muheon</creatorcontrib><creatorcontrib>Pirasteh, Ali</creatorcontrib><creatorcontrib>Ibrahim, Nevein</creatorcontrib><creatorcontrib>Huemann, Zachary</creatorcontrib><creatorcontrib>Castellino, Sharon M</creatorcontrib><creatorcontrib>Kelly, Kara M</creatorcontrib><creatorcontrib>Garrett, John</creatorcontrib><creatorcontrib>Hu, Junjie</creatorcontrib><creatorcontrib>Cho, Steve Y</creatorcontrib><creatorcontrib>Bradshaw, Tyler 
J</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Nursing & Allied Health Premium</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of digital imaging</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tie, Xin</au><au>Shin, Muheon</au><au>Pirasteh, Ali</au><au>Ibrahim, Nevein</au><au>Huemann, Zachary</au><au>Castellino, Sharon M</au><au>Kelly, Kara M</au><au>Garrett, John</au><au>Hu, Junjie</au><au>Cho, Steve Y</au><au>Bradshaw, Tyler J</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Personalized Impression Generation for PET Reports Using Large Language Models</atitle><jtitle>Journal of digital imaging</jtitle><addtitle>J Imaging Inform Med</addtitle><date>2024-04-01</date><risdate>2024</risdate><volume>37</volume><issue>2</issue><spage>471</spage><epage>488</epage><pages>471-488</pages><issn>2948-2933</issn><issn>0897-1889</issn><issn>2948-2925</issn><eissn>2948-2933</eissn><eissn>1618-727X</eissn><abstract>Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into impressions. 
However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether LLMs can create clinically useful impressions for PET reporting. To this end, we fine-tuned twelve open-source language models on a corpus of 37,370 retrospective PET reports collected from our institution. All models were trained using the teacher-forcing algorithm, with the report findings and patient information as input and the original clinical impressions as reference. An extra input token encoded the reading physician's identity, allowing models to learn physician-specific reporting styles. To compare the performances of different models, we computed various automatic evaluation metrics and benchmarked them against physician preferences, ultimately selecting PEGASUS as the top LLM. To evaluate its clinical utility, three nuclear medicine physicians assessed the PEGASUS-generated impressions and original clinical impressions across 6 quality dimensions (3-point scales) and an overall utility score (5-point scale). Each physician reviewed 12 of their own reports and 12 reports from other physicians. When physicians assessed LLM impressions generated in their own style, 89% were considered clinically acceptable, with a mean utility score of 4.08/5. On average, physicians rated these personalized impressions as comparable in overall utility to the impressions dictated by other physicians (4.03, P = 0.41). In summary, our study demonstrated that personalized impressions generated by PEGASUS were clinically useful in most cases, highlighting its potential to expedite PET reporting by automatically drafting impressions.</abstract><cop>Switzerland</cop><pub>Springer Nature B.V</pub><pmid>38308070</pmid><doi>10.1007/s10278-024-00985-3</doi><tpages>18</tpages><orcidid>https://orcid.org/0000-0001-9549-7002</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2948-2933 |
ispartof | Journal of digital imaging, 2024-04, Vol.37 (2), p.471-488 |
issn | 2948-2933 0897-1889 2948-2925 2948-2933 1618-727X |
language | eng |
recordid | cdi_proquest_miscellaneous_2925485720 |
source | Springer Link; PubMed Central |
subjects | Algorithms ; Cancer ; Clinical trials ; Customization ; Datasets ; Informatics ; Language ; Large language models ; Lymphoma ; Nuclear medicine ; Patients ; Pediatrics ; Physicians ; Public health ; Radiology ; Reading |
title | Personalized Impression Generation for PET Reports Using Large Language Models |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T13%3A34%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Personalized%20Impression%20Generation%20for%20PET%20Reports%20Using%20Large%20Language%20Models&rft.jtitle=Journal%20of%20digital%20imaging&rft.au=Tie,%20Xin&rft.date=2024-04-01&rft.volume=37&rft.issue=2&rft.spage=471&rft.epage=488&rft.pages=471-488&rft.issn=2948-2933&rft.eissn=2948-2933&rft_id=info:doi/10.1007/s10278-024-00985-3&rft_dat=%3Cproquest_pubme%3E3041683199%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3041683199&rft_id=info:pmid/38308070&rfr_iscdi=true |
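
The description field in this record says that each model received the report findings and patient information as input, the original clinical impression as reference, and an extra input token encoding the reading physician's identity. The record does not give the token format or the field layout, so the following is only a minimal sketch of how such a training pair might be assembled. The `PetReport` class, the `<physician_…>` token spelling, the `build_example` helper, and the field order are all hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class PetReport:
    """Hypothetical container for one retrospective PET report."""
    physician_id: str   # identifier of the reading physician (assumed format)
    patient_info: str   # free-text patient information
    findings: str       # findings section of the report
    impression: str     # original clinical impression (the reference text)


def physician_token(physician_id: str) -> str:
    # Assumed token spelling; the record only says "an extra input token".
    return f"<physician_{physician_id}>"


def build_example(report: PetReport) -> dict:
    """Build one (source, target) pair for sequence-to-sequence fine-tuning.

    The identity token is placed first so that the model can condition the
    generated impression on the physician's reporting style. The ordering of
    the remaining fields is an assumption made for this sketch.
    """
    source = " ".join(
        [
            physician_token(report.physician_id),
            report.patient_info.strip(),
            report.findings.strip(),
        ]
    )
    return {"source": source, "target": report.impression.strip()}


if __name__ == "__main__":
    # Made-up placeholder text, not data from the study.
    demo = PetReport(
        physician_id="A",
        patient_info="Adult patient.",
        findings="No abnormal uptake.",
        impression="No evidence of disease.",
    )
    print(build_example(demo))
```

With teacher forcing, as named in the description, the `target` text would be fed to the decoder shifted by one position during training, so that each step is conditioned on the reference tokens that precede it rather than on the model's own earlier outputs. Swapping the identity token at inference time is what would let the same findings be summarized in a different physician's style.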