
Using artificial intelligence to estimate the nutritional content of meal photos: an evaluation of ChatGPT-4

Bibliographic Details
Published in: Proceedings of the Nutrition Society, 2024-11, Vol. 83 (OCE4)
Main Authors: O’Hara, C., Kent, G., Flynn, A.C., Gibney, E.R., Timon, C.M.
Format: Article
Language: English
Description
Summary: Dietary intake assessment is an essential part of nutrition research and practice, with the use of digital technology now well established(1) and artificial intelligence (AI) in the form of image recognition readily available in research and commercial settings(2). Recent advances in large language models (LLMs), such as ChatGPT, allow computers to converse in a human-like way, providing text responses to typed queries. No studies, however, have used both the LLM and image recognition components of ChatGPT-4 to evaluate its accuracy in estimating the nutritional content of meals. The aim of this study was to evaluate the accuracy of the ChatGPT-4 LLM and image recognition model in estimating the nutritional content of meals. Thirty-eight meal photographs with known nutritional content (from McCance and Widdowson’s Composition of Foods) were uploaded to ChatGPT, which was asked to provide a point estimate for each meal for each of the following: energy (kcal), protein (g), total carbohydrate (g), dietary fibre (g), total sugar (g), total fat (g), saturated fat (g), monounsaturated fat (g), polyunsaturated fat (g), calcium (mg), iron (mg), sodium (mg), potassium (mg), vitamin D (mcg), folate (mcg), and vitamin C (mg). Comparisons were made between ChatGPT estimates and those from McCance and Widdowson using the Wilcoxon signed-rank test, percent difference, Spearman’s correlation, and cross-classification of quartiles. Interpretation of the statistical measures was based on Lombard et al.(3). For estimating the content of meals, differences (p < 0.05) existed between the methods for 11 of the 16 nutrients, and 12 nutrients had a percent difference of >10%, indicating poor agreement for most nutrients. ChatGPT underestimated 15 of the 16 nutrients. Conversely, when considering the ranking of meals, all nutrients had correlation coefficients indicating good (rs ≥ 0.50; 11 of 16) or acceptable (0.20 < rs < 0.49; 5 of 16) agreement. In the cross-classification of quartiles, ≥50% of meals were classified into the same quartile by both methods for 9 nutrients, and ≤10% of meals were classified into opposite quartiles for 14 nutrients, indicating good agreement. ChatGPT also provided caveats regarding its estimations, such as “the caloric estimate assumes the butter is spread thinly” and “cornflakes can often be fortified with vitamins and minerals […] and exact content could also vary based on the brand of cornflakes”. ChatGPT showed poor agreement for estimating the absolute nutritional content of meals, but acceptable-to-good agreement for ranking meals by their nutrient content.
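As a rough illustration of the agreement statistics named in the abstract (Wilcoxon signed-rank test, percent difference, Spearman’s correlation, and cross-classification of quartiles), the sketch below shows how such a per-nutrient comparison could be computed with SciPy and pandas. This is not the authors’ code: the meal values, column names, and the compare_methods helper are hypothetical, and the exact cut-offs used in the study follow Lombard et al.(3) rather than anything encoded here.

```python
# Hypothetical sketch of the per-nutrient agreement statistics described in the abstract.
import numpy as np
import pandas as pd
from scipy.stats import wilcoxon, spearmanr

def compare_methods(reference: pd.Series, estimate: pd.Series) -> dict:
    """Agreement statistics for one nutrient across all meals."""
    # Paired difference test between the two methods (p < 0.05 read as a significant difference)
    _, p_value = wilcoxon(estimate, reference)

    # Percent difference of the group mean relative to the reference values
    pct_diff = 100 * (estimate.mean() - reference.mean()) / reference.mean()

    # Rank agreement across meals
    rho, _ = spearmanr(estimate, reference)

    # Cross-classification of quartiles: share of meals placed in the same quartile
    # by both methods, and share placed in opposite (first vs fourth) quartiles
    q_ref = pd.qcut(reference.rank(method="first"), 4, labels=False)
    q_est = pd.qcut(estimate.rank(method="first"), 4, labels=False)
    same_quartile = np.mean(q_ref == q_est)
    opposite_quartile = np.mean(np.abs(q_ref - q_est) == 3)

    return {
        "wilcoxon_p": p_value,
        "percent_difference": pct_diff,
        "spearman_rho": rho,
        "same_quartile_pct": 100 * same_quartile,
        "opposite_quartile_pct": 100 * opposite_quartile,
    }

# Hypothetical per-meal values for a single nutrient (energy, kcal)
meals = pd.DataFrame({
    "energy_reference_kcal": [310, 450, 520, 275, 640, 390, 710, 480],
    "energy_chatgpt_kcal":   [280, 400, 500, 300, 560, 350, 650, 435],
})
print(compare_methods(meals["energy_reference_kcal"], meals["energy_chatgpt_kcal"]))
```

In this framing, the Wilcoxon test and percent difference assess agreement on absolute nutrient amounts, while Spearman’s correlation and the quartile cross-classification assess whether the two methods rank meals similarly, which mirrors the split in the study’s findings.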
ISSN: 0029-6651; 1475-2719
DOI: 10.1017/S0029665124005743