What happened
The medRxiv preprint "Predicting Protein Cascade Expression from H&E Images" (Alejandro Leyva et al., medRxiv, 2026 Jan 24; DOI 10.64898/2026.01.23.26344725) presents a computational pipeline that links routine hematoxylin and eosin (H&E) whole-slide images (WSIs) to downstream protein expression measured by Reverse Phase Protein Array (RPPA). According to the preprint, the authors paired RPPA data with WSIs from the Cancer Genome Atlas Breast Adenocarcinoma dataset (TCGA-BRCA) to predict the expression of five proteins drawn from the apoptosis cascade, with DNA damage and repair (DDR) cascade proteins used as a control. The manuscript reports that patch-level Vision Transformers (ViTs) produced R-squared values below 0.1, while a cellular-level Vision Transformer named CellRPPA (also cited as CellViT in the paper) produced R-squared values above 0.1 across five test folds.
Technical details
The preprint frames the task as a regression problem linking morphology to protein intensities measured by RPPA. The authors describe a cellular-level ViT architecture (CellRPPA/CellViT) that operates on single-cell or cell-centric representations rather than large image patches; the paper contrasts this with standard patch-based ViTs and reports the comparative R-squared results cited above (medRxiv preprint). The study also compares predictive performance across biological pathways, reporting higher predictive signal for morphologically indicative cascades such as apoptosis versus the DDR control, per the preprint.
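The evaluation described above (a regression scored by R-squared on each of five test folds) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names (`r_squared`, `k_fold_indices`, `fit_line`, `cross_validated_r2`) are hypothetical, the single-feature least-squares fit is only a stand-in for a trained model such as CellRPPA, and the data are synthetic numbers generated so that the true explained variance is close to 0.1.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and split them into disjoint test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def fit_line(x, y):
    """One-feature least squares (assumed stand-in for the real model)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_r2(x, y, n_folds=5, seed=0):
    """Fit on the training folds, score R-squared on each held-out fold."""
    scores = []
    for test_idx in k_fold_indices(len(x), n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        slope, intercept = fit_line(
            [x[i] for i in train_idx], [y[i] for i in train_idx]
        )
        preds = [slope * x[i] + intercept for i in test_idx]
        scores.append(r_squared([y[i] for i in test_idx], preds))
    return scores


# Synthetic example: a weak morphology-like feature plus unit-variance noise.
# With slope 0.35 the population R-squared is about 0.11.
rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [0.35 * a + rng.gauss(0.0, 1.0) for a in x]
fold_scores = cross_validated_r2(x, y)
print([round(s, 3) for s in fold_scores])
```

Per-fold scores scatter around the population value, and a held-out R-squared can fall below zero when a model predicts worse than the fold mean, so a reported cutoff such as 0.1 is easier to judge alongside the fold-to-fold spread.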
Editorial analysis
The preprint aims at estimating downstream cascade-level signals that reflect propagated protein activity. Industry-pattern observations: prior digital pathology efforts have had limited success predicting bulk or single-marker proteomics from H&E when tissue-level signal is weak, and architectures that incorporate cellular context or segmentation-derived features often outperform coarse patch-based models on tasks sensitive to cell state and microenvironment. For practitioners, a cellular-level transformer approach aligns with a broader trend toward fusing single-cell morphology with spatial omics when available.
Context and significance
Editorial analysis: The work is notable for attempting cascade-level inference from routine H&E, which, if reproducible and robust across cohorts, could expand the actionable readouts derivable from archival histology. However, the reported R-squared values (around 0.1) indicate limited explained variance; readers should treat the results as an early-stage proof of concept rather than production-ready biomarkers. The manuscript is posted as a preprint and indexed in PubMed; PubMed labels it as not yet peer reviewed (PubMed entry).
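To put an R-squared of around 0.1 in concrete terms, the short calculation below uses the standard definition of the coefficient of determination. The value 0.1 is used as a round illustrative figure, not as a result for any specific protein in the preprint, and the comparison assumes the baseline is the mean of the evaluated samples.

```latex
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
\quad\Longrightarrow\quad
\frac{\mathrm{RMSE}_{\text{model}}}{\mathrm{RMSE}_{\text{mean baseline}}}
= \sqrt{1 - R^2} = \sqrt{0.9} \approx 0.949
\quad \text{for } R^2 = 0.1 .
```

Under those assumptions, the model's typical prediction error is only about 5% smaller than the error from always predicting the mean expression.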
What to watch
For practitioners: look for peer-reviewed publication, external validation on independent cohorts beyond TCGA-BRCA, ablation studies showing which cellular features drive signal, and comparisons to spatial proteomics modalities. Observers should also watch whether CellRPPA implementations, code, and trained weights are released for reproducibility testing.
Key Points
- A medRxiv preprint links H&E slides to cascade-level RPPA protein signals using TCGA-BRCA paired data, targeting five apoptosis proteins.
- A cellular-level Vision Transformer (CellRPPA/CellViT) outperformed patch-level ViTs, but predictive R-squared values remained near 0.1, indicating modest explained variance.
- Editorial analysis: cellular-context models align with industry trends to fuse morphology and spatial omics, but robust external validation is required before clinical use.
Scoring Rationale
This is a notable methodological preprint linking routine H&E to cascade-level proteomics, which matters to computational-pathology practitioners. Reported effect sizes are modest and the work is currently a preprint, so impact is limited until peer review and external validation.

