Authors: Cakmur, Basar Burak; Koluman, Ali Can; Ciftci, Mehmet Utku; Ciftci, Ebru Aloglu; Ziroglu, Nezih
Title: Gemini 1.5 Flash Provides the Most Reliable Content While ChatGPT-4o Offers the Highest Readability for Patient Education on Meniscal Tears
Type: Article
Publication year: 2025
Date available: 2026-01-15
ISSN: 0942-2056; eISSN: 1433-7347
DOI: 10.1002/ksa.70247
Scopus ID: 2-s2.0-105026030219
URI: https://doi.org/10.1002/ksa.70247
Handle: https://hdl.handle.net/20.500.14517/8700
Language: English
Access rights: open access (info:eu-repo/semantics/openAccess)
Keywords: ChatGPT; DeepSeek; Gemini; Large Language Models; Meniscal Tear; Patient Education
Quartiles: Q1, Q1
WOS ID: WOS:001649493700001

Abstract

Purpose: The aim of this study was to comparatively evaluate the responses generated by three advanced artificial intelligence (AI) models, ChatGPT-4o (OpenAI), Gemini 1.5 Flash (Google) and DeepSeek-V3 (DeepSeek AI), to frequently asked patient questions about meniscal tears in terms of reliability, usefulness, quality and readability.

Methods: Responses from the three AI chatbots were evaluated for 20 common patient questions regarding meniscal tears. Three orthopaedic specialists independently scored reliability and usefulness on 7-point Likert scales and overall response quality using the 5-point Global Quality Scale. Readability was analysed with six established indices. Inter-rater agreement was examined with intraclass correlation coefficients (ICCs) and Fleiss' Kappa, while between-model differences were tested using Kruskal-Wallis and ANOVA with Bonferroni adjustment.

Results: Gemini 1.5 Flash achieved the highest reliability, significantly outperforming both GPT-4o and DeepSeek-V3 (p = 0.001). While usefulness scores were broadly similar, Gemini was superior to DeepSeek-V3 (p = 0.045). Global Quality Scale scores did not differ significantly among models. In contrast, GPT-4o consistently provided the most readable content (p < 0.001). Inter-rater reliability was excellent across all evaluation domains (ICC > 0.9).

Conclusion: All three AI models generated high-quality educational content regarding meniscal tears. Gemini 1.5 Flash demonstrated the highest reliability and usefulness, while GPT-4o provided significantly more readable responses. These findings highlight the trade-off between reliability and readability in AI-generated patient education materials and emphasise the importance of physician oversight to ensure safe, evidence-based integration of these tools into clinical practice.
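The abstract notes that readability was analysed with six established indices but does not specify the implementation. As an illustration only, the sketch below computes one of the classic indices, the Flesch Reading Ease score, using a rough vowel-group heuristic for syllable counting; the function names and the tokenisation rules are assumptions, not the study's actual pipeline.

```python
import re

def count_syllables(word: str) -> int:
    """Rough syllable count: number of contiguous vowel groups, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    Higher scores indicate easier text; ~60-70 corresponds to plain English."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```

For example, short-sentence, common-word text scores markedly higher than dense clinical prose, which is the kind of gap the study's readability comparison between models captures.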