PLOS ONE | Research Highlights | Open Access

Performance and safety of a fine-tuned small language model for pediatric emergency triage: A benchmark study

04 Jun 2026

GIST (Key Takeaways)

by Eui Jun Lee, Jae Yun Jung, Do Kyun Kim, Joong Wan Park, Young Ho Kwak

  • Pediatric emergency triage is a safety-critical task, and recent studies have explored whether artificial intelligence, including language models, can support triage decision-making; however, evidence on fine-tuned open-weight language models remains limited. We conducted a retrospective benchmark study using de-identified triage records from a tertiary pediatric emergency department in Korea collected from January 2020 to April 2025.
  • After exclusions, 74,170 encounters were included. Each encounter was reconstructed into a case-level text sequence from triage-time structured variables and nurse-authored narratives.
  • Qwen3-8B-Base was fine-tuned with Low-Rank Adaptation (LoRA) and Group Relative Policy Optimization (GRPO) using a safety-oriented reward design, then compared with a structured-data XGBoost model on a common evaluable test subset of 14,832 encounters. The fine-tuned model achieved an accuracy of 58.60%, a macro-F1 score of 0.417, and a quadratic weighted kappa of 0.535.
  • Within-one-level agreement was 97.13%, and strict under-triage, defined as true Korean Triage and Acuity Scale levels 1 or 2 predicted as levels 4 or 5, occurred in 0.65% of cases.
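
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, macro-F1, quadratic weighted kappa, within-one-level agreement, and strict under-triage could be computed from true and predicted Korean Triage and Acuity Scale (KTAS) levels. It is not the authors' evaluation code. The function names and the toy labels are hypothetical, and the sketch assumes the under-triage rate is taken over all test cases, which the summary does not state explicitly.

```python
from collections import Counter

LEVELS = (1, 2, 3, 4, 5)  # KTAS levels; 1 is the most urgent


def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted level equals the true level."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, levels=LEVELS):
    """Unweighted mean of per-level F1 scores (a level with no support scores 0)."""
    scores = []
    for k in levels:
        tp = sum(t == k and p == k for t, p in zip(y_true, y_pred))
        fp = sum(t != k and p == k for t, p in zip(y_true, y_pred))
        fn = sum(t == k and p != k for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def quadratic_weighted_kappa(y_true, y_pred, levels=LEVELS):
    """Cohen's kappa with disagreement weights proportional to (i - j)^2."""
    n = len(y_true)
    span = (len(levels) - 1) ** 2
    observed = Counter(zip(y_true, y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    num = 0.0  # observed weighted disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in levels:
        for j in levels:
            w = (i - j) ** 2 / span
            num += w * observed[(i, j)] / n
            den += w * true_counts[i] * pred_counts[j] / (n * n)
    # Degenerate case (all labels identical): treated as perfect agreement here.
    return 1.0 - num / den if den else 1.0


def within_one_level(y_true, y_pred):
    """Fraction of cases where prediction and truth differ by at most one level."""
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)


def strict_under_triage_rate(y_true, y_pred):
    """Fraction of all cases with true level 1 or 2 predicted as level 4 or 5.

    Assumption: the denominator is all cases, not only true level 1-2 cases.
    """
    hits = sum(t in (1, 2) and p in (4, 5) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Toy labels for illustration only; these are not data from the study.
toy_true = [1, 2, 3, 4, 5, 2, 3, 3]
toy_pred = [1, 4, 3, 4, 4, 2, 2, 3]

if __name__ == "__main__":
    print("accuracy:", accuracy(toy_true, toy_pred))
    print("macro-F1:", macro_f1(toy_true, toy_pred))
    print("QWK:", quadratic_weighted_kappa(toy_true, toy_pred))
    print("within one level:", within_one_level(toy_true, toy_pred))
    print("strict under-triage:", strict_under_triage_rate(toy_true, toy_pred))
```

The quadratic weighting penalizes a two-level miss four times as heavily as a one-level miss, which is why kappa and within-one-level agreement can look reasonable even when exact-match accuracy is modest.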

Clinical Editorial

Summary

PLOS ONE (Medicine) published a clinical update in Research Highlights on 04 Jun 2026.

The item focuses on Performance and safety of a fine-tuned small language model for pediatric emergency triage: A benchmark study.

Review the original article for the full source wording and details.
