Assessing ChatGPT vs. Standard Medical Resources for Endoscopic Sleeve Gastroplasty Education: A Medical Professional Evaluation Study
Published in: Obesity Surgery, 2024-07, Vol. 34 (7), p. 2718-2724
Main Authors: Aburumman, Razan; Al Annan, Karim; Mrad, Rudy; Brunaldi, Vitor O.; Gala, Khushboo; Abu Dayyeh, Barham K.
Format: Article
Language: English
Publisher: Springer US (New York)
Subjects: Accuracy; Artificial Intelligence; Bariatric Surgery; Chatbots; Endoscopy; Female; Gastroplasty - methods; Humans; Male; Medicine; Medicine & Public Health; New Concept; Obesity, Morbid - surgery; Reproducibility of Results; Surgery
ISSN: 0960-8923
EISSN: 1708-0428
DOI: 10.1007/s11695-024-07283-5
PMID: 38758515
Background and Aims
The Chat Generative Pre-Trained Transformer (ChatGPT) represents a significant advancement in artificial intelligence (AI) chatbot technology. While ChatGPT offers promising capabilities, concerns remain about its reliability and accuracy. This study aims to evaluate ChatGPT's responses to patients' frequently asked questions about Endoscopic Sleeve Gastroplasty (ESG).

Methods
Expert gastroenterologists and bariatric surgeons with experience in ESG were invited to evaluate ChatGPT-generated answers to eight ESG-related questions, alongside answers to the same questions sourced from hospital websites. The evaluation criteria included ease of understanding, scientific accuracy, and overall answer satisfaction. Raters were also tasked with discerning whether each response was AI-generated.

Results
Twelve medical professionals with expertise in ESG participated, 83.3% of whom had experience performing the procedure independently. The entire cohort possessed substantial knowledge about ESG. Participants' rating of ChatGPT's utility, on a scale of one to five, averaged 2.75. The raters demonstrated a 54% accuracy rate in distinguishing AI-generated responses, with a sensitivity of 39% and a specificity of 60%, averaging 17.6 correct identifications out of a possible 31. Overall, there were no significant differences between AI-generated and non-AI responses in scientific accuracy, understandability, or satisfaction, with one notable exception: for the question defining ESG, the AI-generated definition scored higher in scientific accuracy (4.33 vs. 3.61, p = 0.007) and satisfaction (4.33 vs. 3.58, p = 0.009) than the non-AI versions.
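The detection statistics above follow the standard confusion-matrix definitions. A minimal sketch of how the three figures relate, assuming "AI-generated" is treated as the positive class (an assumption consistent with how the abstract reports them, not something the abstract states explicitly):

```latex
% Standard confusion-matrix definitions behind the reported detection
% statistics. Assumption: "AI-generated" is the positive class, so
% TP = AI answers correctly flagged as AI and TN = non-AI answers
% correctly identified as non-AI.
\begin{align*}
\text{Sensitivity} &= \frac{TP}{TP + FN} \approx 39\% \\
\text{Specificity} &= \frac{TN}{TN + FP} \approx 60\% \\
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \approx 54\%
\end{align*}
```

Read this way, the low sensitivity indicates that raters missed most of the AI-generated answers, while the higher specificity indicates they were somewhat better at recognizing the hospital-sourced answers as human-written.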
Conclusions
This study underscores ChatGPT's efficacy in providing medical information on ESG, demonstrating that its scientific accuracy is comparable to that of traditional sources.

Graphical Abstract