Comparing ChatGPT-3.5, Gemini 2.0, and DeepSeek V3 for pediatric pneumonia learning in medical students

  • Hacettepe University
  • Korea University
  • Witten/Herdecke University

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Pediatric pneumonia (PP) remains an important topic in undergraduate medical education and offers a suitable framework for evaluating large language models (LLMs) in AI-assisted learning. We developed a 27-question open-ended survey covering five core domains of PP: diagnosis, etiology, diagnostics, treatment, and prevention. DeepSeek V3, Gemini 2.0, and ChatGPT-3.5 were each provided with identical reference materials. Two pediatric infectious disease specialists independently assessed the responses using a custom evaluation tool, a structured 10-point Likert-type rubric. DeepSeek V3 achieved the highest mean score (9.9), outperforming ChatGPT-3.5 (7.7) and Gemini 2.0 (7.5) across all domains (p < 0.001). It received a full score on 26 of 27 questions (96.3%) and achieved an accuracy score of ≥ 5. Performance differences were most pronounced in higher-order reasoning areas, including age-specific etiology and imaging interpretation, where DeepSeek V3 outperformed the other models by up to 3.2 points. Although all models gave safe responses in nearly all cases, the variability in content quality highlights the need for careful platform selection. Future research should compare educational outcomes of AI-assisted and conventional learning approaches to better define the role of LLMs in medical education.

Original language: English
Article number: 40342
Journal: Scientific Reports
Volume: 15
Issue number: 1
Publication status: Published - Dec 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Artificial intelligence
  • Children
  • Lower respiratory tract infection
  • Medical education
  • Pneumonia
