Researchers predict protein cascade expression from H&E images

May 4, 2026 | By LDS Team

Relevance Score: 6.2

The medRxiv preprint "Predicting Protein Cascade Expression from H&E Images" (Alejandro Leyva et al., medRxiv, 2026 Jan 24; DOI 10.64898/2026.01.23.26344725) reports a method for estimating downstream protein expression from routine H&E whole-slide images using Reverse Phase Protein Array (RPPA) measurements from the Cancer Genome Atlas Breast Adenocarcinoma dataset (TCGA-BRCA). Per the preprint, the paper targets five proteins from the apoptosis cascade and uses DNA damage and repair (DDR) proteins as a biological control. The authors compare patch-level Vision Transformers (ViT) against a cellular-level ViT they call CellRPPA (referred to as CellViT in parts of the manuscript) and report that patch-level ViTs achieve R-squared values below 0.1, while CellRPPA attains R-squared values above 0.1 across five test folds. The study is posted to medRxiv and indexed in PubMed as a preprint; PubMed notes that it has not yet been peer reviewed.

What happened

The medRxiv preprint "Predicting Protein Cascade Expression from H&E Images" (Alejandro Leyva et al., medRxiv, 2026 Jan 24; DOI 10.64898/2026.01.23.26344725) presents a computational pipeline that links routine hematoxylin and eosin (H&E) whole-slide images (WSIs) to downstream protein expression measured by Reverse Phase Protein Array (RPPA). According to the preprint, the authors used RPPA data paired with WSIs from the Cancer Genome Atlas Breast Adenocarcinoma dataset (TCGA-BRCA) to predict the expression of five proteins drawn from the apoptosis cascade, with DNA damage and repair (DDR) cascade proteins used as a control. The manuscript reports that patch-level Vision Transformers (ViT) produced R-squared values below 0.1, while a cellular-level Vision Transformer named CellRPPA (also cited as CellViT in the paper) produced R-squared values above 0.1 across five test folds, per the preprint.

Technical details

The preprint frames the task as a regression problem linking morphology to protein intensities measured by RPPA. The authors describe a cellular-level ViT architecture (CellRPPA/CellViT) that operates on single-cell or cell-centric representations rather than large image patches; the paper contrasts this with standard patch-based ViTs and reports the comparative R-squared results cited above (medRxiv preprint). The study also compares predictive performance across biological pathways, reporting higher predictive signal for morphologically indicative cascades such as apoptosis versus the DDR control, per the preprint.
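To make the reported metric concrete, here is a minimal sketch of how R-squared can be computed separately for each test fold of a regression model. This is not the authors' evaluation code: the function names and the toy numbers are hypothetical, and the preprint's exact fold construction is not described in this article.

```python
from statistics import fmean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def per_fold_r2(folds):
    """Score each (y_true, y_pred) pair, one pair per held-out fold."""
    return [r_squared(y_true, y_pred) for y_true, y_pred in folds]


# Made-up values for illustration only; these are NOT results from the preprint.
toy_folds = [
    ([1.0, 2.0, 3.0, 4.0], [2.0, 2.5, 2.5, 3.0]),  # partially correct predictions
    ([0.0, 1.0, 2.0], [1.0, 1.0, 1.0]),            # always predicts the fold mean
]
scores = per_fold_r2(toy_folds)
print(scores)
```

A model that always predicts the mean of the held-out targets scores exactly 0, and a model that does worse than that scores below 0, which is why values just above or below 0.1 sit close to the no-signal baseline.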

Methodologically, the paper targets a harder task than single-protein prediction: estimating downstream cascade-level signals that reflect propagated protein activity. As a general industry pattern, prior digital pathology efforts have had limited success predicting bulk or single-marker proteomics from H&E when tissue-level signal is weak, and architectures that incorporate cellular context or segmentation-derived features often outperform coarse patch-based models on tasks sensitive to cell state and microenvironment. For practitioners, a cellular-level transformer approach aligns with a broader trend to fuse single-cell morphology with spatial omics when available.

Context and significance

Editorial analysis: The work is notable for attempting cascade-level inference from routine H&E, which, if reproducible and robust across cohorts, could expand the actionable readouts derivable from archival histology. However, the reported R-squared values (around 0.1) indicate limited explained variance; readers should treat the results as an early-stage proof of concept rather than production-ready biomarkers. The manuscript is posted as a preprint and indexed in PubMed; PubMed labels it as not yet peer reviewed (PubMed entry).
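One way to read an R-squared near 0.1 follows directly from its definition. Since R-squared is 1 minus the ratio of residual to total sum of squares, the model's root-mean-square error, expressed as a fraction of the error of a predict-the-mean baseline on the same data, is sqrt(1 - R-squared). The short sketch below works through that arithmetic; the function name is hypothetical, and 0.1 is the approximate threshold cited in this article, not a per-protein result.

```python
import math


def residual_fraction(r2):
    """Model RMSE divided by the RMSE of a predict-the-mean baseline."""
    if r2 > 1.0:
        raise ValueError("R-squared cannot exceed 1")
    return math.sqrt(1.0 - r2)


print(round(residual_fraction(0.1), 3))  # 0.949
```

Under this reading, an R-squared of 0.1 corresponds to prediction error roughly 5% smaller than simply guessing the mean expression level.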

What to watch

For practitioners: look for peer-reviewed publication, external validation on independent cohorts beyond TCGA-BRCA, ablation studies showing which cellular features drive signal, and comparisons to spatial proteomics modalities. Observers should also watch whether CellRPPA implementations, code, and trained weights are released for reproducibility testing.

Key Points

1. A medRxiv preprint links H&E slides to cascade-level RPPA protein signals using TCGA-BRCA paired data, targeting five apoptosis proteins.
2. A cellular-level Vision Transformer (CellRPPA/CellViT) outperformed patch-level ViTs, but predictive R-squared values remained near 0.1, indicating modest explained variance.
3. Editorial analysis: cellular-context models align with industry trends to fuse morphology and spatial omics, but robust external validation is required before clinical use.

Scoring Rationale

This is a notable methodological preprint linking routine H&E to cascade-level proteomics, which matters to computational-pathology practitioners. Reported effect sizes are modest and the work is currently a preprint, so impact is limited until peer review and external validation.

Sources

Public references used for this report.

journals.plos.org: Predicting Protein Cascade Expression from H&E Images

pmc.ncbi.nlm.nih.gov: Predicting Protein Cascade Expression from H&E Images

pubmed.ncbi.nlm.nih.gov: Predicting Protein Cascade Expression from H&E Images
