
Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges

This paper presents a comprehensive systematic review of generative models (GANs, VAEs, DMs, and LLMs) used to synthesize various medical data types, including imaging (dermoscopic, mammographic, ultrasound, CT, MRI, and X-ray), text, time-series, and tabular data (EHR). Unlike previous narrowly foc...

Bibliographic Details
Published in: arXiv.org, 2024-07
Main Authors: Ibrahim, Mahmoud; Yasmina Al Khalil; Amirrajab, Sina; Chang, Sun; Breeuwer, Marcel; Pluim, Josien; Elen, Bart; Ertaylan, Gokhan; Dumontier, Michel
Format: Article
Language: English
container_title arXiv.org
creator Ibrahim, Mahmoud
Yasmina Al Khalil
Amirrajab, Sina
Chang, Sun
Breeuwer, Marcel
Pluim, Josien
Elen, Bart
Ertaylan, Gokhan
Dumontier, Michel
description This paper presents a comprehensive systematic review of generative models (GANs, VAEs, DMs, and LLMs) used to synthesize various medical data types, including imaging (dermoscopic, mammographic, ultrasound, CT, MRI, and X-ray), text, time-series, and tabular data (EHR). Unlike previous narrowly focused reviews, our study encompasses a broad array of medical data modalities and explores various generative models. Our search strategy queries databases such as Scopus, PubMed, and ArXiv, focusing on recent works from January 2021 to November 2023, excluding reviews and perspectives. This period emphasizes recent advancements beyond GANs, which have been extensively covered previously. The survey reveals insights from three key aspects: (1) Synthesis applications and purpose of synthesis, (2) generation techniques, and (3) evaluation methods. It highlights clinically valid synthesis applications, demonstrating the potential of synthetic data to tackle diverse clinical requirements. While conditional models incorporating class labels, segmentation masks and image translations are prevalent, there is a gap in utilizing prior clinical knowledge and patient-specific context, suggesting a need for more personalized synthesis approaches and emphasizing the importance of tailoring generative approaches to the unique characteristics of medical data. Additionally, there is a significant gap in using synthetic data beyond augmentation, such as for validation and evaluation of downstream medical AI models. The survey uncovers that the lack of standardized evaluation methodologies tailored to medical images is a barrier to clinical application, underscoring the need for in-depth evaluation approaches, benchmarking, and comparative studies to promote openness and collaboration.
format article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3075439799</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3075439799</sourcerecordid><originalsourceid>FETCH-proquest_journals_30754397993</originalsourceid><addsrcrecordid>eNqNi8tqwzAQRUWg0NDmHwa6DriSXdfdmbzahTdJ9mGwx42CIjmasUM2-fa60A_o6h6450zUVBvzOn9PtX5UM-ZTkiT6LddZZqbqviFPEcUOBOUXtCHC7ublSGJrWKIglHUMzFD1TmznCCpqbI0OqtCgs2KJP6AcIxY642-1pcHSFUI7Uk1eYEkDudCdR2ZA38DiiM6R_yZ-Vg8tOqbZ3z6pl_Vqv_icdzFcemI5nEIf_XgdTJJnqSnyojD_s34A6YtPXA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3075439799</pqid></control><display><type>article</type><title>Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges</title><source>Publicly Available Content (ProQuest)</source><creator>Ibrahim, Mahmoud ; Yasmina Al Khalil ; Amirrajab, Sina ; Chang, Sun ; Breeuwer, Marcel ; Pluim, Josien ; Elen, Bart ; Ertaylan, Gokhan ; Dumontier, Michel</creator><creatorcontrib>Ibrahim, Mahmoud ; Yasmina Al Khalil ; Amirrajab, Sina ; Chang, Sun ; Breeuwer, Marcel ; Pluim, Josien ; Elen, Bart ; Ertaylan, Gokhan ; Dumontier, Michel</creatorcontrib><description>This paper presents a comprehensive systematic review of generative models (GANs, VAEs, DMs, and LLMs) used to synthesize various medical data types, including imaging (dermoscopic, mammographic, ultrasound, CT, MRI, and X-ray), text, time-series, and tabular data (EHR). Unlike previous narrowly focused reviews, our study encompasses a broad array of medical data modalities and explores various generative models. Our search strategy queries databases such as Scopus, PubMed, and ArXiv, focusing on recent works from January 2021 to November 2023, excluding reviews and perspectives. 
This period emphasizes recent advancements beyond GANs, which have been extensively covered previously. The survey reveals insights from three key aspects: (1) Synthesis applications and purpose of synthesis, (2) generation techniques, and (3) evaluation methods. It highlights clinically valid synthesis applications, demonstrating the potential of synthetic data to tackle diverse clinical requirements. While conditional models incorporating class labels, segmentation masks and image translations are prevalent, there is a gap in utilizing prior clinical knowledge and patient-specific context, suggesting a need for more personalized synthesis approaches and emphasizing the importance of tailoring generative approaches to the unique characteristics of medical data. Additionally, there is a significant gap in using synthetic data beyond augmentation, such as for validation and evaluation of downstream medical AI models. The survey uncovers that the lack of standardized evaluation methodologies tailored to medical images is a barrier to clinical application, underscoring the need for in-depth evaluation approaches, benchmarking, and comparative studies to promote openness and collaboration.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Comparative studies ; Computed tomography ; Generative artificial intelligence ; Image segmentation ; Medical imaging ; Synthetic data ; Systematic review ; Translations</subject><ispartof>arXiv.org, 2024-07</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/3075439799?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>776,780,25731,36989,44566</link.rule.ids></links><search><creatorcontrib>Ibrahim, Mahmoud</creatorcontrib><creatorcontrib>Yasmina Al Khalil</creatorcontrib><creatorcontrib>Amirrajab, Sina</creatorcontrib><creatorcontrib>Chang, Sun</creatorcontrib><creatorcontrib>Breeuwer, Marcel</creatorcontrib><creatorcontrib>Pluim, Josien</creatorcontrib><creatorcontrib>Elen, Bart</creatorcontrib><creatorcontrib>Ertaylan, Gokhan</creatorcontrib><creatorcontrib>Dumontier, Michel</creatorcontrib><title>Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges</title><title>arXiv.org</title><description>This paper presents a comprehensive systematic review of generative models (GANs, VAEs, DMs, and LLMs) used to synthesize various medical data types, including imaging (dermoscopic, mammographic, ultrasound, CT, MRI, and X-ray), text, time-series, and tabular data (EHR). Unlike previous narrowly focused reviews, our study encompasses a broad array of medical data modalities and explores various generative models. Our search strategy queries databases such as Scopus, PubMed, and ArXiv, focusing on recent works from January 2021 to November 2023, excluding reviews and perspectives. This period emphasizes recent advancements beyond GANs, which have been extensively covered previously. 
The survey reveals insights from three key aspects: (1) Synthesis applications and purpose of synthesis, (2) generation techniques, and (3) evaluation methods. It highlights clinically valid synthesis applications, demonstrating the potential of synthetic data to tackle diverse clinical requirements. While conditional models incorporating class labels, segmentation masks and image translations are prevalent, there is a gap in utilizing prior clinical knowledge and patient-specific context, suggesting a need for more personalized synthesis approaches and emphasizing the importance of tailoring generative approaches to the unique characteristics of medical data. Additionally, there is a significant gap in using synthetic data beyond augmentation, such as for validation and evaluation of downstream medical AI models. The survey uncovers that the lack of standardized evaluation methodologies tailored to medical images is a barrier to clinical application, underscoring the need for in-depth evaluation approaches, benchmarking, and comparative studies to promote openness and collaboration.</description><subject>Comparative studies</subject><subject>Computed tomography</subject><subject>Generative artificial intelligence</subject><subject>Image segmentation</subject><subject>Medical imaging</subject><subject>Synthetic data</subject><subject>Systematic review</subject><subject>Translations</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqNi8tqwzAQRUWg0NDmHwa6DriSXdfdmbzahTdJ9mGwx42CIjmasUM2-fa60A_o6h6450zUVBvzOn9PtX5UM-ZTkiT6LddZZqbqviFPEcUOBOUXtCHC7ublSGJrWKIglHUMzFD1TmznCCpqbI0OqtCgs2KJP6AcIxY642-1pcHSFUI7Uk1eYEkDudCdR2ZA38DiiM6R_yZ-Vg8tOqbZ3z6pl_Vqv_icdzFcemI5nEIf_XgdTJJnqSnyojD_s34A6YtPXA</recordid><startdate>20240702</startdate><enddate>20240702</enddate><creator>Ibrahim, Mahmoud</creator><creator>Yasmina Al 
Khalil</creator><creator>Amirrajab, Sina</creator><creator>Chang, Sun</creator><creator>Breeuwer, Marcel</creator><creator>Pluim, Josien</creator><creator>Elen, Bart</creator><creator>Ertaylan, Gokhan</creator><creator>Dumontier, Michel</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240702</creationdate><title>Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges</title><author>Ibrahim, Mahmoud ; Yasmina Al Khalil ; Amirrajab, Sina ; Chang, Sun ; Breeuwer, Marcel ; Pluim, Josien ; Elen, Bart ; Ertaylan, Gokhan ; Dumontier, Michel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_30754397993</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Comparative studies</topic><topic>Computed tomography</topic><topic>Generative artificial intelligence</topic><topic>Image segmentation</topic><topic>Medical imaging</topic><topic>Synthetic data</topic><topic>Systematic review</topic><topic>Translations</topic><toplevel>online_resources</toplevel><creatorcontrib>Ibrahim, Mahmoud</creatorcontrib><creatorcontrib>Yasmina Al Khalil</creatorcontrib><creatorcontrib>Amirrajab, Sina</creatorcontrib><creatorcontrib>Chang, Sun</creatorcontrib><creatorcontrib>Breeuwer, Marcel</creatorcontrib><creatorcontrib>Pluim, Josien</creatorcontrib><creatorcontrib>Elen, Bart</creatorcontrib><creatorcontrib>Ertaylan, Gokhan</creatorcontrib><creatorcontrib>Dumontier, 
Michel</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content (ProQuest)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ibrahim, Mahmoud</au><au>Yasmina Al Khalil</au><au>Amirrajab, Sina</au><au>Chang, Sun</au><au>Breeuwer, Marcel</au><au>Pluim, Josien</au><au>Elen, Bart</au><au>Ertaylan, Gokhan</au><au>Dumontier, Michel</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges</atitle><jtitle>arXiv.org</jtitle><date>2024-07-02</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>This paper presents a comprehensive systematic review of generative models (GANs, VAEs, DMs, and LLMs) used to synthesize various medical data types, including imaging (dermoscopic, mammographic, ultrasound, CT, MRI, and X-ray), text, time-series, and tabular data (EHR). 
Unlike previous narrowly focused reviews, our study encompasses a broad array of medical data modalities and explores various generative models. Our search strategy queries databases such as Scopus, PubMed, and ArXiv, focusing on recent works from January 2021 to November 2023, excluding reviews and perspectives. This period emphasizes recent advancements beyond GANs, which have been extensively covered previously. The survey reveals insights from three key aspects: (1) Synthesis applications and purpose of synthesis, (2) generation techniques, and (3) evaluation methods. It highlights clinically valid synthesis applications, demonstrating the potential of synthetic data to tackle diverse clinical requirements. While conditional models incorporating class labels, segmentation masks and image translations are prevalent, there is a gap in utilizing prior clinical knowledge and patient-specific context, suggesting a need for more personalized synthesis approaches and emphasizing the importance of tailoring generative approaches to the unique characteristics of medical data. Additionally, there is a significant gap in using synthetic data beyond augmentation, such as for validation and evaluation of downstream medical AI models. The survey uncovers that the lack of standardized evaluation methodologies tailored to medical images is a barrier to clinical application, underscoring the need for in-depth evaluation approaches, benchmarking, and comparative studies to promote openness and collaboration.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-07
issn 2331-8422
language eng
recordid cdi_proquest_journals_3075439799
source Publicly Available Content (ProQuest)
subjects Comparative studies
Computed tomography
Generative artificial intelligence
Image segmentation
Medical imaging
Synthetic data
Systematic review
Translations
title Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T19%3A59%3A30IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Generative%20AI%20for%20Synthetic%20Data%20Across%20Multiple%20Medical%20Modalities:%20A%20Systematic%20Review%20of%20Recent%20Developments%20and%20Challenges&rft.jtitle=arXiv.org&rft.au=Ibrahim,%20Mahmoud&rft.date=2024-07-02&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3075439799%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_30754397993%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3075439799&rft_id=info:pmid/&rfr_iscdi=true