Abstract
Pediatric pneumonia (PP) remains an important topic in undergraduate medical education and offers a suitable framework for evaluating large language models (LLMs) in AI-assisted learning. We developed a 27-item open-ended survey covering five core domains of PP: diagnosis, etiology, diagnostics, treatment, and prevention. DeepSeek V3, Gemini 2.0, and ChatGPT-3.5 were each provided with identical reference materials. Two pediatric infectious disease specialists independently assessed the responses using a structured 10-point Likert rubric, a custom evaluation tool. DeepSeek V3 achieved the highest mean score (9.9), outperforming ChatGPT-3.5 (7.7) and Gemini 2.0 (7.5) across all domains (p < 0.001). Moreover, it received a full score on 26 of 27 questions (96.3%) and achieved an accuracy score of ≥ 5. In addition, the performance gap was largest in higher-order reasoning areas, including age-specific etiology and imaging interpretation, where DeepSeek V3 outperformed the other models by up to 3.2 points. Although all models demonstrated near-complete safety, the variability in content quality highlights the need for careful platform selection. Future research should therefore compare educational outcomes of AI-assisted and conventional learning approaches to better define the role of LLMs in medical education.
| Original language | English |
|---|---|
| Article number | 40342 |
| Journal | Scientific Reports |
| Volume | 15 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - Dec 2025 |
UN SDGs
This output contributes to the following Sustainable Development Goal(s):
- SDG 3 Good Health and Well-being
Title: Comparing ChatGPT-3.5, Gemini 2.0, and DeepSeek V3 for pediatric pneumonia learning in medical students