Diagnosis in Bytes: Comparing the Diagnostic Accuracy of Google and ChatGPT 3.5 as an Educational Support Tool
Published in: | International journal of environmental research and public health, 2024-05, Vol. 21 (5), p. 580 |
---|---|
Main Authors: | Guimaraes, Guilherme R; Figueiredo, Ricardo G; Silva, Caroline Santos; Arata, Vanessa; Contreras, Jean Carlos Z; Gomes, Cristiano M; Tiraboschi, Ricardo B; Bessa Junior, José |
Format: | Article |
Language: | English |
Subjects: | Accuracy; Artificial intelligence; Chatbots; COVID-19 - diagnosis; Diagnosis, Differential; Education; Humans; Internet; Medical students; SARS-CoV-2; Search engines; Urologic Diseases - diagnosis; Urology |
Abstract: Adopting advanced digital technologies as diagnostic support tools in healthcare is an unquestionable trend accelerated by the COVID-19 pandemic. However, their accuracy in suggesting diagnoses remains controversial and needs to be explored. We aimed to evaluate and compare the diagnostic accuracy of two freely accessible internet search tools: Google and ChatGPT 3.5.
To assess the effectiveness of both platforms, we conducted evaluations using a sample of 60 clinical cases related to urological pathologies. The cases were organized into two distinct categories for the analysis: (i) prevalent conditions, compiled from the most common symptoms as outlined in the EAU and UpToDate guidelines, and (ii) unusual disorders, identified through case reports published in the journal Urology Case Reports from 2022 to 2023. Each platform's responses were classified into three categories to determine its accuracy: "correct diagnosis", "likely differential diagnosis", and "incorrect diagnosis". A panel of experts, blinded to which platform produced each response, evaluated the responses in random order.
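The three-way outcome classification amounts to a simple per-case tally. The sketch below shows how such ratings could be aggregated into the percentages reported next; the `accuracy_breakdown` helper and the example counts are hypothetical illustrations, not the authors' analysis code.

```python
from collections import Counter

# The three outcome categories defined in the study's methods.
CATEGORIES = (
    "correct diagnosis",
    "likely differential diagnosis",
    "incorrect diagnosis",
)

def accuracy_breakdown(ratings):
    """Tally one platform's per-case ratings into percentages per category."""
    counts = Counter(ratings)
    total = len(ratings)
    return {cat: round(100 * counts[cat] / total, 1) for cat in CATEGORIES}

# Hypothetical ratings for one platform on one group of 30 cases
# (counts chosen only to show the arithmetic, e.g. 16/30 -> 53.3%).
example = (
    ["correct diagnosis"] * 16
    + ["likely differential diagnosis"] * 7
    + ["incorrect diagnosis"] * 7
)
print(accuracy_breakdown(example))
# {'correct diagnosis': 53.3, 'likely differential diagnosis': 23.3,
#  'incorrect diagnosis': 23.3}
```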
For commonly encountered urological conditions, Google's accuracy was 53.3%, with a further 23.3% of its responses falling within a plausible range of differential diagnoses and the remaining 23.3% incorrect. ChatGPT 3.5 outperformed Google with an accuracy of 86.6%, provided a likely differential diagnosis in the remaining 13.3% of cases, and produced no incorrect diagnoses. For the unusual disorders, Google delivered no correct diagnoses but proposed a likely differential diagnosis in 20% of cases. ChatGPT 3.5 identified the correct diagnosis in 16.6% of the rare cases and offered a reasonable differential diagnosis in half of them.
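Assuming the 60 cases split evenly into 30 common and 30 rare cases (an inference from the reported percentages, not stated explicitly in the abstract), the figures map back onto whole-case counts; the short check below is illustrative only.

```python
# Assumption: 30 cases per group, inferred from the percentages
# (e.g. 53.3% ~ 16/30, 86.6% ~ 26/30); not stated outright in the abstract.
reported = {
    ("Google", "common", "correct"): 53.3,
    ("Google", "common", "differential"): 23.3,
    ("ChatGPT 3.5", "common", "correct"): 86.6,
    ("ChatGPT 3.5", "common", "differential"): 13.3,
    ("Google", "rare", "differential"): 20.0,
    ("ChatGPT 3.5", "rare", "correct"): 16.6,
    ("ChatGPT 3.5", "rare", "differential"): 50.0,
}
for (platform, group, outcome), pct in reported.items():
    cases = round(pct / 100 * 30)  # back out the implied whole-case count
    print(f"{platform:12s} {group:6s} {outcome:12s} ~ {cases}/30 cases")
```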
ChatGPT 3.5 demonstrated higher diagnostic accuracy than Google in both contexts. The platform showed satisfactory accuracy when diagnosing common cases, yet its performance in identifying rare conditions remains limited.
DOI: | 10.3390/ijerph21050580 |
---|---|
PMID: | 38791794 |
ISSN: | 1661-7827 (print); 1660-4601 (electronic) |
Source: | Publicly Available Content Database; PubMed Central; Free Full-Text Journals in Chemistry; Coronavirus Research Database |