B - 113 Assessing the Neuropsychology Information Base of Large Language Models
Published in: Archives of Clinical Neuropsychology, 2024-10, Vol. 39 (7), p. 1214-1215
Main Authors:
Format: Article
Language: English
Summary:
Abstract
Objective
Research has demonstrated that Large Language Models (LLMs) can obtain passing scores on medical board-certification examinations and have made substantial improvements in recent years (e.g., ChatGPT-4 and ChatGPT-3.5 demonstrating an accuracy of 83.4% and 73.4%, respectively, on neurosurgical practice written board-certification questions). To date, the extent of LLMs’ neuropsychology domain information has not been investigated. This study is an initial exploration of ChatGPT-3.5, ChatGPT-4, and Gemini’s performance on mock clinical neuropsychology written board-certification examination questions.
Methods
Six hundred practice examination questions were obtained from the BRAIN American Academy of Clinical Neuropsychology (AACN) website. Data for specific question domains and pediatric subclassification were available for 300 items. Using an a priori prompting strategy, the questions were input into ChatGPT-3.5, ChatGPT-4, and Gemini. Responses were scored based on BRAIN AACN answer keys. Chi-squared tests assessed LLMs’ performance overall and within domains, and significance was set at p = 0.002 using Bonferroni correction.
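The analysis described in the Methods can be illustrated with a minimal sketch (not the authors' code): accuracies of two models on the same item set are compared with a chi-squared test on a 2x2 table of correct/incorrect counts, evaluated against the Bonferroni-corrected threshold of p = 0.002 reported in the abstract. The assumed number of planned comparisons, the helper name compare_models, and the example counts are illustrative assumptions, not details from the study.

```python
# Sketch of a chi-squared comparison of two models' accuracy with a
# Bonferroni-corrected alpha. Counts below are placeholders, not study data.
from scipy.stats import chi2_contingency

N_ITEMS = 600              # total practice questions, per the abstract
N_COMPARISONS = 25         # assumed number of planned tests: 0.05 / 25 = 0.002
ALPHA = 0.05 / N_COMPARISONS

def compare_models(correct_a: int, correct_b: int, n: int = N_ITEMS) -> None:
    """Chi-squared test on a 2x2 contingency table of correct/incorrect counts."""
    table = [
        [correct_a, n - correct_a],   # model A: correct, incorrect
        [correct_b, n - correct_b],   # model B: correct, incorrect
    ]
    chi2, p, dof, _ = chi2_contingency(table)
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"chi2={chi2:.2f}, p={p:.4f} ({verdict} at alpha={ALPHA})")

# Hypothetical usage with the accuracies reported in the abstract (74% vs. 62.5%):
compare_models(correct_a=int(0.74 * N_ITEMS), correct_b=int(0.625 * N_ITEMS))
```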
Results
Across all six hundred items, ChatGPT-4 had superior accuracy (74%) to ChatGPT-3.5 (62.5%) and Gemini (52.7%; p's …
ISSN: 1873-5843
DOI: 10.1093/arclin/acae067.274