Accuracy and reproducibility of ChatGPT's free version answers about endometriosis

Bibliographic Details
Published in: International Journal of Gynecology and Obstetrics, 2024-05, Vol. 165(2), pp. 691–695
Main Authors: Ozgor, Bahar Yuksel, Simavi, Melek Azade
Format: Article
Language:English
Description
Summary:
Objective: To evaluate, for the first time, the accuracy and reproducibility of answers about endometriosis given by the free version of ChatGPT.
Methods: Detailed internet searches were performed to identify frequently asked questions (FAQs) about endometriosis. Scientific questions were prepared in accordance with the European Society of Human Reproduction and Embryology (ESHRE) endometriosis guidelines. An experienced gynecologist scored each ChatGPT answer from 1 to 4. Repeatability was assessed by asking each question twice; an answer was considered reproducible if both responses to the same question fell into the same score category.
Results: A total of 91.4% (n = 71) of all FAQs were answered completely, accurately, and sufficiently. ChatGPT was most accurate in the symptoms and diagnosis category (94.1%, 16/17 questions) and least accurate in the treatment category (81.3%, 13/16 questions). Furthermore, of the 40 questions based on the ESHRE endometriosis guidelines, 27 (67.5%) were classified as grade 1, seven (17.5%) as grade 2, and six (15.0%) as grade 3. Reproducibility was highest for FAQs in the prevention, symptoms and diagnosis, and complications categories (100% for each) and lowest for questions based on the ESHRE endometriosis guidelines (70.0%).
Conclusion: ChatGPT responded accurately and satisfactorily to more than 90% of the questions about endometriosis, but to only 67.5% of the questions based on the ESHRE endometriosis guidelines.
Synopsis: ChatGPT can answer questions about endometriosis, but its accuracy and sufficiency are lower for guideline-based questions.
ISSN: 0020-7292
EISSN: 1879-3479
DOI: 10.1002/ijgo.15309