Background
Metabolic dysfunction-associated steatotic liver disease (MASLD) is the most prevalent chronic liver disease, ranging from simple steatosis (MASL) to metabolic dysfunction-associated steatohepatitis (MASH). However, reliable noninvasive strategies for accurately distinguishing MASL from MASH at an early stage remain limited.
We therefore aimed to develop a robust molecular model to improve early identification of disease progression and subtype discrimination.

Methods
Five datasets from the Gene Expression Omnibus were integrated as a training cohort comprising 149 MASL and 158 MASH samples, while a separate dataset, GSE135251, served as the validation cohort and included 51 MASL and 155 MASH samples. Differential expression analysis and weighted gene co-expression network analysis were conducted to identify gene modules.
Overlapping genes were subjected to protein interaction network construction and topological ranking. Least absolute shrinkage and selection operator regression, support vector machine recursive feature elimination, and random forest algorithms were jointly applied to derive robust diagnostic candidates.
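The joint use of several feature selectors can be illustrated with a minimal sketch that keeps only the genes every selector retains. This assumes a plain set intersection as the combination rule; the function name `consensus_genes` and the placeholder gene names are hypothetical and are not taken from the study.

```python
def consensus_genes(*selections):
    """Return the genes chosen by every selector, sorted for stable output."""
    if not selections:
        return []
    shared = set(selections[0])
    for chosen in selections[1:]:
        shared &= set(chosen)  # keep only genes also picked by this selector
    return sorted(shared)


# Hypothetical selector outputs (placeholder names, not results from the study).
lasso_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
svm_rfe_genes = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]
random_forest_genes = ["GENE_C", "GENE_D", "GENE_F"]

candidates = consensus_genes(lasso_genes, svm_rfe_genes, random_forest_genes)
```

Requiring agreement across all selectors is a conservative choice: it favors a short, stable candidate list over sensitivity to any single algorithm.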
An artificial neural network classifier was established based on the final gene set and evaluated in both cohorts. Immune cell composition was estimated using CIBERSORT.
Single-cell RNA sequencing data from GSE136103 were analyzed to determine cell-type-specific expression patterns. Quantitative real-time PCR validation was conducted in 60 clinical liver tissue samples.

Results
A total of 656 differentially expressed genes were identified between MASL and MASH.
Network integration and machine learning intersection analysis consistently yielded six key genes: MMP9, FABP5, TREM2, CTSD, UBD, and MAP2K1. Five genes were upregulated in MASH, whereas MAP2K1 was downregulated.
Individual genes demonstrated moderate diagnostic performance, with area under the curve values ranging from 0.692 to 0.822 in the training cohort. The artificial neural network model achieved an area under the curve of 0.893 (95% CI 0.854 to 0.925) in the validation cohort.
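For readers unfamiliar with how an area under the curve and its confidence interval can be obtained, the sketch below computes the AUC as the fraction of MASH/MASL score pairs ranked correctly and attaches a percentile bootstrap interval. The abstract does not state how its interval was derived, so the bootstrap here is an assumption; the function names and the toy scores are hypothetical.

```python
import random


def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def bootstrap_ci(scores_pos, scores_neg, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling each class with replacement."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        boot_pos = [rng.choice(scores_pos) for _ in scores_pos]
        boot_neg = [rng.choice(scores_neg) for _ in scores_neg]
        stats.append(auc(boot_pos, boot_neg))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy classifier scores (illustrative only, not data from the study).
mash_scores = [0.9, 0.8, 0.7, 0.6, 0.55]
masl_scores = [0.5, 0.65, 0.3, 0.2, 0.4]

point_estimate = auc(mash_scores, masl_scores)
ci_low, ci_high = bootstrap_ci(mash_scores, masl_scores)
```

Resampling within each class keeps the class sizes fixed, which matters when the cohort is imbalanced, as the validation cohort here is.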
Immune infiltration analysis revealed increased monocytes, M0 and M1 macrophages, and activated dendritic cells in MASH. Single-cell analysis localized the key genes predominantly to myeloid populations, and quantitative PCR confirmed consistent differential expression in clinical samples.

Conclusion
This study establishes a multicohort, machine learning-based gene signature with high diagnostic accuracy for distinguishing MASL from MASH and provides insight into the immune-metabolic mechanisms underlying disease progression.
Frontiers in Immunology published a clinical update in Infectious Disease on 20 Apr 2026. The item focuses on Integrated transcriptomic and single cell analysis combined with artificial neural network identifies a robust gene signature for early discrimination of MASL and MASH.