This study investigates plasma biomarkers to aid stroke diagnosis using cross-platform proteomics and machine learning. In a case–control design, adults with suspected stroke were enrolled in emergency departments before treatment. The discovery cohort (n=100) spanned acute ischemic stroke, intracerebral hemorrhage, transient ischemic attack, and stroke mimics; the external validation cohort (n=80) had equal representation across these groups.
SomaScan proteomics quantified 7307 proteins, of which 61 differentiated stroke subtypes. Protein panels were derived by LASSO logistic regression, validated internally by repeated nested cross-validation (rCV) and targeted mass spectrometry (MS), and validated externally by data-independent acquisition MS in an independent Yale cohort.
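To make the modelling step concrete, the following is a minimal sketch of L1-penalized (LASSO) logistic regression evaluated by repeated nested cross-validation, written with scikit-learn. It is not the study's pipeline: the data are synthetic, and the function name `repeated_nested_cv_auc`, the fold counts, the number of repeats, and the grid of penalty strengths are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import (
    GridSearchCV,
    RepeatedStratifiedKFold,
    StratifiedKFold,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def repeated_nested_cv_auc(X, y, n_repeats=3, outer_splits=5, inner_splits=3, seed=0):
    """Return one held-out AUC per outer fold (hypothetical helper)."""
    outer = RepeatedStratifiedKFold(
        n_splits=outer_splits, n_repeats=n_repeats, random_state=seed
    )
    # Assumed grid of inverse penalty strengths; smaller C gives a sparser panel.
    grid = {"logisticregression__C": [0.05, 0.1, 0.5, 1.0]}
    aucs = []
    for train_idx, test_idx in outer.split(X, y):
        # Inner loop: tune the penalty using the outer-training data only.
        inner = StratifiedKFold(n_splits=inner_splits, shuffle=True, random_state=seed)
        model = make_pipeline(
            StandardScaler(),
            LogisticRegression(penalty="l1", solver="liblinear"),
        )
        search = GridSearchCV(model, grid, scoring="roc_auc", cv=inner)
        search.fit(X[train_idx], y[train_idx])
        # Outer loop: score the refit best model on data it has never seen.
        scores = search.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], scores))
    return np.array(aucs)


# Synthetic stand-in for one subtype-versus-rest problem (not study data).
X, y = make_classification(
    n_samples=100, n_features=61, n_informative=7, random_state=0
)
aucs = repeated_nested_cv_auc(X, y)
print(f"mean outer-fold AUC: {aucs.mean():.2f} over {aucs.size} folds")
```

Tuning the penalty inside the inner loop keeps the outer-fold AUC free of selection bias, which is the usual reason for nesting when the sample is small relative to the number of candidate proteins.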
The subtype-specific classifiers comprised 7 proteins for acute ischemic stroke (rCV-AUC 0.82; 95% CI 0.78–0.86), 6 for intracerebral hemorrhage (AUC 0.70; 95% CI 0.64–0.76), 8 for transient ischemic attack (AUC 0.78; 95% CI 0.73–0.84), and 7 for stroke mimics (AUC 0.81; 95% CI 0.77–0.86). Internal validation identified 11 proteins and external validation identified 32 proteins, with VTN, PLG, and S100A9 highlighted among the top classifiers.
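The summary does not state how the 95% confidence intervals were obtained. One common option, shown here purely as an assumption, is a percentile bootstrap over subjects. The sketch below uses synthetic labels and scores, and the helper name `bootstrap_auc_ci` is hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Return (AUC, lower, upper) using a percentile bootstrap (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)  # resample subjects with replacement
        if np.unique(y_true[idx]).size < 2:
            continue  # AUC is undefined when a resample has only one class
        stats.append(roc_auc_score(y_true[idx], y_score[idx]))
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return roc_auc_score(y_true, y_score), lower, upper


# Synthetic example (not study data): scores loosely track the labels.
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=100)
y_score = y_true + rng.normal(0.0, 1.0, size=100)
auc, lower, upper = bootstrap_auc_ci(y_true, y_score)
print(f"AUC {auc:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With cohorts of this size the interval is wide, which is consistent with the spread of roughly 0.08 to 0.12 AUC units in the intervals reported above.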
Generalizability beyond the study cohorts remains uncertain, and larger multicenter validation is recommended.