Evaluation of the usability of ChatGPT-4 Pro and Gemini 2.5 Pro in patient education about brain tumors

Umut Ogün Mutlucan(1), Cihan Bedel(2), Fatih Selvi(3), Ökkeş Zortuk(4), Cezmi Çağrı Türk(5)
(1) Health Science University Antalya Training and Research Hospital, Department of Neurosurgery, Antalya, Türkiye,
(2) Health Science University Antalya Training and Research Hospital, Department of Emergency Medicine, Antalya, Türkiye,
(3) Health Science University Antalya Training and Research Hospital, Department of Emergency Medicine, Antalya, Türkiye,
(4) Hatay Research and Training Hospital, Department of Emergency Medicine, Hatay, Türkiye,
(5) Health Science University Antalya Training and Research Hospital, Department of Neurosurgery, Antalya, Türkiye

Abstract

Aim: This study aimed to assess the reliability of the ChatGPT-4 Pro and Gemini 2.5 Pro chatbots by having neurosurgery specialists systematically evaluate the chatbots' responses to patients' questions about brain tumors.


Methods: This study was conducted using artificial intelligence programs; the authors have no relationship with the AI companies or sites associated with the study. A total of 56 frequently asked questions were identified. The models examined were ChatGPT-4 Pro and Gemini 2.5 Pro. Both models produced their responses in Turkish. Two neurosurgeons then scored each response independently, each without knowledge of the other's scores. When both neurosurgeons assigned the same score to a response, that score was accepted as final. When their scores differed, the two discussed the response and recorded a consensus score.
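
The score-reconciliation rule described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' procedure as implemented: the function name `final_score` and the `resolve` callback are hypothetical, and the callback only stands in for the neurosurgeons' consensus discussion, which is a human step that code cannot reproduce.

```python
from typing import Callable, Optional


def final_score(
    score_a: int,
    score_b: int,
    resolve: Optional[Callable[[int, int], int]] = None,
) -> int:
    """Return the final score for one chatbot response.

    score_a, score_b: scores given independently by the two raters.
    resolve: hypothetical stand-in for the consensus discussion; it is
             called only when the two scores differ.
    """
    if score_a == score_b:
        # Agreement: the shared score is accepted as final.
        return score_a
    if resolve is None:
        # Disagreement with no consensus recorded yet.
        raise ValueError("raters disagree; a consensus score is required")
    return resolve(score_a, score_b)
```

For example, `final_score(5, 5)` returns the agreed score directly, while `final_score(4, 5, resolve=lambda a, b: 4)` returns whatever the (here hard-coded) consensus step decides.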


Results: The distribution of the questions is shown in Table 2. For anatomy questions, the mean score was 4.25 ± 0.88 for ChatGPT and 4.50 ± 0.53 for Gemini (p = 0.282). For general questions, the mean scores were 4.43 ± 0.81 for ChatGPT and 4.38 ± 0.81 for Gemini (p = 0.500). For questions on prognosis and activities of daily living, the mean scores were 5.00 ± 0.00 for ChatGPT and 4.57 ± 0.78 for Gemini (p = 0.100). For treatment questions, the mean scores were 4.63 ± 0.67 for ChatGPT and 4.36 ± 0.67 for Gemini (p = 0.138). Figure 1 compares the scores by question group. Overall, ChatGPT and Gemini performed similarly, with mean scores of 4.54 and 4.44, respectively.


Conclusion: This study shows that large language models (LLMs), including ChatGPT-4 Pro and Gemini 2.5 Pro, hold considerable promise for providing information and guidance to patients. Neither model was substantially superior to the other in educating patients about brain tumors.

Authors

Umut Ogün Mutlucan
Cihan Bedel
Fatih Selvi
Ökkeş Zortuk
o.zortuk@gmail.com (Primary Contact)
Cezmi Çağrı Türk
Mutlucan UO, Bedel C, Selvi F, Zortuk Ö, Türk CÇ. Evaluation of the usability of ChatGPT-4 Pro and Gemini 2.5 Pro in patient education about brain tumors. J Med Dent Invest. 2025;6:e250132. doi:10.5577/jomdi.e250132
