Accuracy of polyphenol content information in berries: A comparative analysis of ChatGPT and Phenol-Explorer - UTU Research Portal

A1 Refereed original research article in a scientific journal

Accuracy of polyphenol content information in berries: A comparative analysis of ChatGPT and Phenol-Explorer

Authors: Sarıkaya, Buse; Kaya Kaçar, Hüsna

Publication year: 2025

Journal: Nutrition and health

ISSN: 0260-1060

eISSN: 2047-945X

DOI: https://doi.org/10.1177/02601060251408541

Publication's open availability at the time of reporting: No Open Access

Publication channel's open availability: Partially Open Access publication channel

Web address: https://doi.org/10.1177/02601060251408541

Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/508538301

Self-archived copy's version: Final draft

Abstract

Background

Polyphenols are widely occurring bioactive compounds in fruits and are extensively investigated for their potential health effects. The growing prominence of artificial intelligence tools in nutrition science necessitates evaluating their capacity to provide accurate biochemical data.

Aim

This analysis aims to assess the reliability of two models, ChatGPT-4o mini (free version) and ChatGPT-4o (paid version), in predicting polyphenol compound concentrations, and to evaluate their potential use in nutritional research and health applications.

Methods

Seven different berries were selected for the study, and their anthocyanins, flavonols, phenolic acids, lignans, and stilbenes were queried in three different sessions using both ChatGPT-4o mini (free version) and ChatGPT-4o (paid version). The responses were compared with those from Phenol-Explorer, and the evaluation was based on relative accuracy (%).
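The abstract does not state how relative accuracy (%) was calculated. The sketch below shows one common way such a score could be computed, assuming it is defined as 100 × (1 − |predicted − reference| / reference), floored at zero. The function names and the example values are hypothetical and are not taken from the article or from Phenol-Explorer.

```python
from statistics import mean


def relative_accuracy(predicted: float, reference: float) -> float:
    """Percent agreement between a predicted and a reference concentration.

    Assumed formula (not confirmed by the article):
    100 * (1 - |predicted - reference| / reference), floored at 0.
    """
    if reference <= 0:
        raise ValueError("reference concentration must be positive")
    relative_error = abs(predicted - reference) / reference
    return max(0.0, 100.0 * (1.0 - relative_error))


def mean_session_accuracy(session_values: list[float], reference: float) -> float:
    """Average the relative accuracy over repeated query sessions."""
    return mean(relative_accuracy(value, reference) for value in session_values)


# Hypothetical example: three sessions queried for one compound class,
# compared against a made-up reference value of 100.0 mg/100 g.
sessions = [80.0, 120.0, 100.0]
score = mean_session_accuracy(sessions, reference=100.0)
```

Under this assumed definition, over- and underestimates of the same size are penalised equally, and any prediction more than twice the reference value scores zero.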

Results

No significant difference in relative accuracy (%) was found between the ChatGPT-4o mini (41.36 ± 34.74) and ChatGPT-4o (46.23 ± 34.01) models (p > 0.05; Cohen's d = −0.107). In ChatGPT-4o mini, the highest mean accuracy was observed for total polyphenols (68.01 ± 25.00%; significantly higher than flavonols, p < 0.01), followed by anthocyanins (58.95 ± 32.68%). In ChatGPT-4o, anthocyanins showed the highest accuracy (65.36 ± 38.17%; significantly higher than flavonols, p < 0.01, and stilbenes, p < 0.001), followed closely by total polyphenols (65.72 ± 20.93%). Accuracy for flavonols, phenolic acids, and stilbenes was lower than for other compounds.

Conclusion

This study shows that ChatGPT-4o mini and ChatGPT-4o exhibit varying accuracy in predicting polyphenols, with higher accuracy for common compounds l

Downloadable publication

This is an electronic reprint of the original article.
This reprint may differ from the original in pagination and typographic detail. Please cite the original version.


Funding information in the publication:
The authors received no financial support for the research, authorship, and/or publication of this article.