OKR-CELL Introduces Robust Single-Cell Foundation Model
In a preprint posted on Jan. 9, 2026, Haoran Wang and colleagues introduce OKR-CELL, a cross-modal foundation model for single-cell multi-omics pretrained on 32 million cell-text pairs. The approach uses LLM-based retrieval-augmented generation (RAG) to enrich cell descriptions, and a Cross-modal Robust Alignment objective that combines reliability scoring, curriculum learning, and coupled momentum contrastive learning. The authors report leading results across six tasks, including clustering, annotation, batch correction, and zero-shot retrieval.
Key Points
- Pretrains OKR-CELL on 32 million cell-text pairs using a cross-modal cell-language framework
- Integrates LLM-based RAG to enrich textual cell descriptions with open-world biological knowledge
- Implements Cross-modal Robust Alignment with reliability scoring and a contrastive curriculum to mitigate noisy modalities
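The idea of down-weighting unreliable cell-text pairs in a contrastive objective can be illustrated with a minimal sketch. This is not the preprint's implementation: the function name `weighted_contrastive_loss`, the `temperature` value, and the assumption that a per-pair reliability score is already available are all hypothetical, and the momentum encoders and curriculum schedule are omitted. It shows only a symmetric InfoNCE-style loss in which each pair's contribution is scaled by its reliability.

```python
import numpy as np


def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def weighted_contrastive_loss(cell_emb, text_emb, reliability, temperature=0.07):
    """Symmetric contrastive loss with per-pair reliability weights.

    cell_emb, text_emb: arrays of shape (n, d); row i of each forms a pair.
    reliability: array of shape (n,), assumed positive (hypothetical scores).
    """
    # L2-normalise so the dot product is cosine similarity.
    cell = cell_emb / np.linalg.norm(cell_emb, axis=1, keepdims=True)
    text = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = cell @ text.T / temperature

    idx = np.arange(logits.shape[0])
    cell_to_text = -log_softmax(logits, axis=1)[idx, idx]
    text_to_cell = -log_softmax(logits, axis=0)[idx, idx]
    per_pair = 0.5 * (cell_to_text + text_to_cell)

    # Normalise weights so they sum to 1, then average the per-pair losses.
    weights = reliability / reliability.sum()
    return float((weights * per_pair).sum())


# Toy usage: pair 0 has a mismatched description; lowering its reliability
# reduces how much it contributes to the loss.
cells = np.eye(4)
texts = np.eye(4)
texts[0] = texts[1]  # noisy description for cell 0
uniform = weighted_contrastive_loss(cells, texts, np.ones(4))
downweighted = weighted_contrastive_loss(cells, texts, np.array([0.01, 1.0, 1.0, 1.0]))
print(uniform > downweighted)
```

In this toy setup the noisy pair dominates the loss under uniform weights, and assigning it a small reliability score shrinks its influence, which is the general intuition behind reliability-weighted alignment.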
Scoring Rationale
Strong methodological novelty and large-scale evaluation, but limited by its narrow domain scope and its status as a single-source preprint.