ChatGPT Health showed high accuracy for moderately urgent conditions but frequently overtriaged mild cases and undertriaged emergencies. These findings highlight safety risks at clinical extremes, raising concerns about the reliability of AI tools for urgent care decision-making.
Busch, F. et al. Current applications and challenges in large language models for patient care: a systematic review. Commun. Med. 5, 26 (2025). A review outlining the current capabilities and limitations of LLMs in clinical care, providing context for their use in decision-making tasks such as triage.
Shekar, S., Pataranutaporn, P., Sarabu, C., Cecchi, G. A. & Maes, P. People overtrust AI-generated medical advice despite low accuracy. NEJM AI https://doi.org/10.1056/AIoa2300015 (2025). This study shows that users frequently rely on AI-generated medical advice even when it is incorrect, underscoring the real-world risks of triage errors.
Hager, P. et al. Evaluation and mitigation of the limitations of large language models in clinical decision-making. Nat. Med. 30, 2613–2622 (2024).
Nature Medicine published a clinical update in Research Highlights on 07 May 2026.
The item covers how ChatGPT Health triage advice falls short in key cases.
Review the original article for the full source wording and details.