Personalized Impression Generation for PET Reports Using Large Language Models

Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether...

Bibliographic Details
Published in: Journal of digital imaging 2024-04, Vol.37 (2), p.471-488
Main Authors: Tie, Xin; Shin, Muheon; Pirasteh, Ali; Ibrahim, Nevein; Huemann, Zachary; Castellino, Sharon M; Kelly, Kara M; Garrett, John; Hu, Junjie; Cho, Steve Y; Bradshaw, Tyler J
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3
cites cdi_FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3
container_end_page 488
container_issue 2
container_start_page 471
container_title Journal of digital imaging
container_volume 37
creator Tie, Xin
Shin, Muheon
Pirasteh, Ali
Ibrahim, Nevein
Huemann, Zachary
Castellino, Sharon M
Kelly, Kara M
Garrett, John
Hu, Junjie
Cho, Steve Y
Bradshaw, Tyler J
description Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether LLMs can create clinically useful impressions for PET reporting. To this end, we fine-tuned twelve open-source language models on a corpus of 37,370 retrospective PET reports collected from our institution. All models were trained using the teacher-forcing algorithm, with the report findings and patient information as input and the original clinical impressions as reference. An extra input token encoded the reading physician's identity, allowing models to learn physician-specific reporting styles. To compare the performances of different models, we computed various automatic evaluation metrics and benchmarked them against physician preferences, ultimately selecting PEGASUS as the top LLM. To evaluate its clinical utility, three nuclear medicine physicians assessed the PEGASUS-generated impressions and original clinical impressions across 6 quality dimensions (3-point scales) and an overall utility score (5-point scale). Each physician reviewed 12 of their own reports and 12 reports from other physicians. When physicians assessed LLM impressions generated in their own style, 89% were considered clinically acceptable, with a mean utility score of 4.08/5. On average, physicians rated these personalized impressions as comparable in overall utility to the impressions dictated by other physicians (4.03, P = 0.41). In summary, our study demonstrated that personalized impressions generated by PEGASUS were clinically useful in most cases, highlighting its potential to expedite PET reporting by automatically drafting impressions.
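The description above mentions two mechanics that a short sketch may make concrete: an extra input token that encodes the reading physician's identity, and training with teacher forcing. The Python below is a minimal illustration only, not the authors' implementation. The token format `<physician_A>`, the `<s>` and `</s>` markers, the function names, and the sample report text are all hypothetical and do not come from the record.

```python
def build_model_input(findings: str, patient_info: str, physician_id: str) -> str:
    """Prefix the report text with a token naming the reading physician.

    The token format is an assumption made for illustration; the record only
    says that an extra input token encoded the physician's identity.
    """
    style_token = f"<physician_{physician_id}>"  # hypothetical token format
    return f"{style_token} {patient_info.strip()} {findings.strip()}"


def teacher_forcing_pairs(reference_tokens, bos="<s>", eos="</s>"):
    """Return (decoder_input, target) sequences for teacher forcing.

    The decoder is fed the reference impression shifted right by one
    position, so at every step it predicts the next reference token from
    the true prefix rather than from its own earlier predictions.
    """
    decoder_input = [bos] + list(reference_tokens)
    target = list(reference_tokens) + [eos]
    return decoder_input, target


# Made-up example text, used only to show the shapes involved.
source = build_model_input("No FDG-avid disease.", "Lymphoma follow-up.", "A")
decoder_input, target = teacher_forcing_pairs(["No", "evidence", "of", "disease"])
```

Changing `physician_id` at inference time would, under this assumption, be what steers the generated impression toward a particular physician's style.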
doi_str_mv 10.1007/s10278-024-00985-3
format article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_proquest_miscellaneous_2925485720</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3041683199</sourcerecordid><originalsourceid>FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3</originalsourceid><addsrcrecordid>eNpdUcFKJDEQDeKi4voDHqTBi5d2K6mkO30SkVEHZndl0XPIdFfGlp7OmHQL-vVmHBXdS9WDevWoV4-xQw6nHKD8FTmIUucgZA5QaZXjFtsTldS5qBC3v-BddhDjAwAgcsQCdtguagQNJeyxPzcUou9t175Qk02Xq0Axtr7PrqinYIc1dD5kN5Pb7B-tfBhidhfbfpHNbFhQqv1itAn89g118Sf74WwX6eC977O7y8ntxXU--3s1vTif5TUWfMilK0qFFWrSc1tbYefK2doph4WSUlqrOW-okTXnRbo6-VVKF7qs-NwV5GrcZ2cb3dU4X1JTUz8E25lVaJc2PBtvW_N90rf3ZuGfDOeAXIkyKZy8KwT_OFIczLKNNXWd7cmP0YhKKKlVKSBRj_-jPvgxpJ9FgyB5oZFXVWKJDasOPsZA7vMaDmZtwWwiMyky8xaZwbR09NXH58pHQPgKudGRRg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3041683199</pqid></control><display><type>article</type><title>Personalized Impression Generation for PET Reports Using Large Language Models</title><source>Springer Link</source><source>PubMed Central</source><creator>Tie, Xin ; Shin, Muheon ; Pirasteh, Ali ; Ibrahim, Nevein ; Huemann, Zachary ; Castellino, Sharon M ; Kelly, Kara M ; Garrett, John ; Hu, Junjie ; Cho, Steve Y ; Bradshaw, Tyler J</creator><creatorcontrib>Tie, Xin ; Shin, Muheon ; Pirasteh, Ali ; Ibrahim, Nevein ; Huemann, Zachary ; Castellino, Sharon M ; Kelly, Kara M ; Garrett, John ; Hu, Junjie ; Cho, Steve Y ; Bradshaw, Tyler J</creatorcontrib><description>Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether LLMs can create clinically useful impressions for PET reporting. 
To this end, we fine-tuned twelve open-source language models on a corpus of 37,370 retrospective PET reports collected from our institution. All models were trained using the teacher-forcing algorithm, with the report findings and patient information as input and the original clinical impressions as reference. An extra input token encoded the reading physician's identity, allowing models to learn physician-specific reporting styles. To compare the performances of different models, we computed various automatic evaluation metrics and benchmarked them against physician preferences, ultimately selecting PEGASUS as the top LLM. To evaluate its clinical utility, three nuclear medicine physicians assessed the PEGASUS-generated impressions and original clinical impressions across 6 quality dimensions (3-point scales) and an overall utility score (5-point scale). Each physician reviewed 12 of their own reports and 12 reports from other physicians. When physicians assessed LLM impressions generated in their own style, 89% were considered clinically acceptable, with a mean utility score of 4.08/5. On average, physicians rated these personalized impressions as comparable in overall utility to the impressions dictated by other physicians (4.03, P = 0.41). 
In summary, our study demonstrated that personalized impressions generated by PEGASUS were clinically useful in most cases, highlighting its potential to expedite PET reporting by automatically drafting impressions.</description><identifier>ISSN: 2948-2933</identifier><identifier>ISSN: 0897-1889</identifier><identifier>ISSN: 2948-2925</identifier><identifier>EISSN: 2948-2933</identifier><identifier>EISSN: 1618-727X</identifier><identifier>DOI: 10.1007/s10278-024-00985-3</identifier><identifier>PMID: 38308070</identifier><language>eng</language><publisher>Switzerland: Springer Nature B.V</publisher><subject>Algorithms ; Cancer ; Clinical trials ; Customization ; Datasets ; Informatics ; Language ; Large language models ; Lymphoma ; Nuclear medicine ; Patients ; Pediatrics ; Physicians ; Public health ; Radiology ; Reading</subject><ispartof>Journal of digital imaging, 2024-04, Vol.37 (2), p.471-488</ispartof><rights>2024. The Author(s) under exclusive licence to Society for Imaging Informatics in Medicine.</rights><rights>The Author(s) under exclusive licence to Society for Imaging Informatics in Medicine 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3</citedby><cites>FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3</cites><orcidid>0000-0001-9549-7002</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11031527/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11031527/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,723,776,780,881,27903,27904,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38308070$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Tie, Xin</creatorcontrib><creatorcontrib>Shin, Muheon</creatorcontrib><creatorcontrib>Pirasteh, Ali</creatorcontrib><creatorcontrib>Ibrahim, Nevein</creatorcontrib><creatorcontrib>Huemann, Zachary</creatorcontrib><creatorcontrib>Castellino, Sharon M</creatorcontrib><creatorcontrib>Kelly, Kara M</creatorcontrib><creatorcontrib>Garrett, John</creatorcontrib><creatorcontrib>Hu, Junjie</creatorcontrib><creatorcontrib>Cho, Steve Y</creatorcontrib><creatorcontrib>Bradshaw, Tyler J</creatorcontrib><title>Personalized Impression Generation for PET Reports Using Large Language Models</title><title>Journal of digital imaging</title><addtitle>J Imaging Inform Med</addtitle><description>Large language models (LLMs) have shown 
promise in accelerating radiology reporting by summarizing clinical findings into impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether LLMs can create clinically useful impressions for PET reporting. To this end, we fine-tuned twelve open-source language models on a corpus of 37,370 retrospective PET reports collected from our institution. All models were trained using the teacher-forcing algorithm, with the report findings and patient information as input and the original clinical impressions as reference. An extra input token encoded the reading physician's identity, allowing models to learn physician-specific reporting styles. To compare the performances of different models, we computed various automatic evaluation metrics and benchmarked them against physician preferences, ultimately selecting PEGASUS as the top LLM. To evaluate its clinical utility, three nuclear medicine physicians assessed the PEGASUS-generated impressions and original clinical impressions across 6 quality dimensions (3-point scales) and an overall utility score (5-point scale). Each physician reviewed 12 of their own reports and 12 reports from other physicians. When physicians assessed LLM impressions generated in their own style, 89% were considered clinically acceptable, with a mean utility score of 4.08/5. On average, physicians rated these personalized impressions as comparable in overall utility to the impressions dictated by other physicians (4.03, P = 0.41). 
In summary, our study demonstrated that personalized impressions generated by PEGASUS were clinically useful in most cases, highlighting its potential to expedite PET reporting by automatically drafting impressions.</description><subject>Algorithms</subject><subject>Cancer</subject><subject>Clinical trials</subject><subject>Customization</subject><subject>Datasets</subject><subject>Informatics</subject><subject>Language</subject><subject>Large language models</subject><subject>Lymphoma</subject><subject>Nuclear medicine</subject><subject>Patients</subject><subject>Pediatrics</subject><subject>Physicians</subject><subject>Public health</subject><subject>Radiology</subject><subject>Reading</subject><issn>2948-2933</issn><issn>0897-1889</issn><issn>2948-2925</issn><issn>2948-2933</issn><issn>1618-727X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNpdUcFKJDEQDeKi4voDHqTBi5d2K6mkO30SkVEHZndl0XPIdFfGlp7OmHQL-vVmHBXdS9WDevWoV4-xQw6nHKD8FTmIUucgZA5QaZXjFtsTldS5qBC3v-BddhDjAwAgcsQCdtguagQNJeyxPzcUou9t175Qk02Xq0Axtr7PrqinYIc1dD5kN5Pb7B-tfBhidhfbfpHNbFhQqv1itAn89g118Sf74WwX6eC977O7y8ntxXU--3s1vTif5TUWfMilK0qFFWrSc1tbYefK2doph4WSUlqrOW-okTXnRbo6-VVKF7qs-NwV5GrcZ2cb3dU4X1JTUz8E25lVaJc2PBtvW_N90rf3ZuGfDOeAXIkyKZy8KwT_OFIczLKNNXWd7cmP0YhKKKlVKSBRj_-jPvgxpJ9FgyB5oZFXVWKJDasOPsZA7vMaDmZtwWwiMyky8xaZwbR09NXH58pHQPgKudGRRg</recordid><startdate>20240401</startdate><enddate>20240401</enddate><creator>Tie, Xin</creator><creator>Shin, Muheon</creator><creator>Pirasteh, Ali</creator><creator>Ibrahim, Nevein</creator><creator>Huemann, Zachary</creator><creator>Castellino, Sharon M</creator><creator>Kelly, Kara M</creator><creator>Garrett, John</creator><creator>Hu, Junjie</creator><creator>Cho, Steve Y</creator><creator>Bradshaw, Tyler J</creator><general>Springer Nature B.V</general><general>Springer International 
Publishing</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7SC</scope><scope>7TK</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>K9.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>NAPCQ</scope><scope>P64</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-9549-7002</orcidid></search><sort><creationdate>20240401</creationdate><title>Personalized Impression Generation for PET Reports Using Large Language Models</title><author>Tie, Xin ; Shin, Muheon ; Pirasteh, Ali ; Ibrahim, Nevein ; Huemann, Zachary ; Castellino, Sharon M ; Kelly, Kara M ; Garrett, John ; Hu, Junjie ; Cho, Steve Y ; Bradshaw, Tyler J</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Cancer</topic><topic>Clinical trials</topic><topic>Customization</topic><topic>Datasets</topic><topic>Informatics</topic><topic>Language</topic><topic>Large language models</topic><topic>Lymphoma</topic><topic>Nuclear medicine</topic><topic>Patients</topic><topic>Pediatrics</topic><topic>Physicians</topic><topic>Public health</topic><topic>Radiology</topic><topic>Reading</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Tie, Xin</creatorcontrib><creatorcontrib>Shin, Muheon</creatorcontrib><creatorcontrib>Pirasteh, Ali</creatorcontrib><creatorcontrib>Ibrahim, Nevein</creatorcontrib><creatorcontrib>Huemann, Zachary</creatorcontrib><creatorcontrib>Castellino, Sharon M</creatorcontrib><creatorcontrib>Kelly, Kara M</creatorcontrib><creatorcontrib>Garrett, John</creatorcontrib><creatorcontrib>Hu, Junjie</creatorcontrib><creatorcontrib>Cho, Steve Y</creatorcontrib><creatorcontrib>Bradshaw, Tyler 
J</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Nursing &amp; Allied Health Premium</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of digital imaging</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tie, Xin</au><au>Shin, Muheon</au><au>Pirasteh, Ali</au><au>Ibrahim, Nevein</au><au>Huemann, Zachary</au><au>Castellino, Sharon M</au><au>Kelly, Kara M</au><au>Garrett, John</au><au>Hu, Junjie</au><au>Cho, Steve Y</au><au>Bradshaw, Tyler J</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Personalized Impression Generation for PET Reports Using Large Language Models</atitle><jtitle>Journal of digital imaging</jtitle><addtitle>J Imaging Inform Med</addtitle><date>2024-04-01</date><risdate>2024</risdate><volume>37</volume><issue>2</issue><spage>471</spage><epage>488</epage><pages>471-488</pages><issn>2948-2933</issn><issn>0897-1889</issn><issn>2948-2925</issn><eissn>2948-2933</eissn><eissn>1618-727X</eissn><abstract>Large language models (LLMs) have shown promise in accelerating radiology reporting by summarizing clinical findings into 
impressions. However, automatic impression generation for whole-body PET reports presents unique challenges and has received little attention. Our study aimed to evaluate whether LLMs can create clinically useful impressions for PET reporting. To this end, we fine-tuned twelve open-source language models on a corpus of 37,370 retrospective PET reports collected from our institution. All models were trained using the teacher-forcing algorithm, with the report findings and patient information as input and the original clinical impressions as reference. An extra input token encoded the reading physician's identity, allowing models to learn physician-specific reporting styles. To compare the performances of different models, we computed various automatic evaluation metrics and benchmarked them against physician preferences, ultimately selecting PEGASUS as the top LLM. To evaluate its clinical utility, three nuclear medicine physicians assessed the PEGASUS-generated impressions and original clinical impressions across 6 quality dimensions (3-point scales) and an overall utility score (5-point scale). Each physician reviewed 12 of their own reports and 12 reports from other physicians. When physicians assessed LLM impressions generated in their own style, 89% were considered clinically acceptable, with a mean utility score of 4.08/5. On average, physicians rated these personalized impressions as comparable in overall utility to the impressions dictated by other physicians (4.03, P = 0.41). In summary, our study demonstrated that personalized impressions generated by PEGASUS were clinically useful in most cases, highlighting its potential to expedite PET reporting by automatically drafting impressions.</abstract><cop>Switzerland</cop><pub>Springer Nature B.V</pub><pmid>38308070</pmid><doi>10.1007/s10278-024-00985-3</doi><tpages>18</tpages><orcidid>https://orcid.org/0000-0001-9549-7002</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2948-2933
ispartof Journal of digital imaging, 2024-04, Vol.37 (2), p.471-488
issn 2948-2933
0897-1889
2948-2925
2948-2933
1618-727X
language eng
recordid cdi_proquest_miscellaneous_2925485720
source Springer Link; PubMed Central
subjects Algorithms
Cancer
Clinical trials
Customization
Datasets
Informatics
Language
Large language models
Lymphoma
Nuclear medicine
Patients
Pediatrics
Physicians
Public health
Radiology
Reading
title Personalized Impression Generation for PET Reports Using Large Language Models
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T13%3A34%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Personalized%20Impression%20Generation%20for%20PET%20Reports%20Using%20Large%20Language%20Models&rft.jtitle=Journal%20of%20digital%20imaging&rft.au=Tie,%20Xin&rft.date=2024-04-01&rft.volume=37&rft.issue=2&rft.spage=471&rft.epage=488&rft.pages=471-488&rft.issn=2948-2933&rft.eissn=2948-2933&rft_id=info:doi/10.1007/s10278-024-00985-3&rft_dat=%3Cproquest_pubme%3E3041683199%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c361t-4f6753938e8baca2ab5facf5f365444aa811ded4c11633100755868791bf6efc3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3041683199&rft_id=info:pmid/38308070&rfr_iscdi=true