Brain-Semantoks Learns Abstract fMRI Representations For Phenotype Prediction
Researchers introduce Brain-Semantoks, a self-supervised framework for fMRI time-series representation learning, in a preprint dated Dec 12, 2025. It uses a semantic tokenizer to aggregate noisy regional signals into network-level tokens, and a self-distillation objective with a curriculum to stabilize representations across time. The learned features support strong phenotype and cognitive prediction with linear probes, and the results show that larger unlabeled datasets improve out-of-distribution performance.
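To make the idea of network-level tokens concrete, here is a minimal sketch that pools regional time series into one signal per functional network by plain averaging. This is only an illustration: the preprint describes a semantic tokenizer whose details are not given here, and the function name `network_tokens`, the array shapes, and the region-to-network assignment are all assumptions made for this example.

```python
import numpy as np

def network_tokens(roi_ts, network_ids, n_networks):
    """Average regional time series within each network (illustrative only).

    roi_ts:      array of shape (T, R), T time points by R regions
    network_ids: array of shape (R,), integer network label per region
    returns:     array of shape (T, n_networks)
    """
    roi_ts = np.asarray(roi_ts, dtype=float)
    network_ids = np.asarray(network_ids)
    # One-hot membership matrix of shape (R, n_networks).
    membership = np.eye(n_networks)[network_ids]
    counts = membership.sum(axis=0)
    if np.any(counts == 0):
        raise ValueError("every network needs at least one region")
    # Sum the regions belonging to each network, then divide by region count.
    return roi_ts @ membership / counts

# Hypothetical data: 200 time points, 12 regions, 3 networks of 4 regions.
rng = np.random.default_rng(0)
roi_ts = rng.standard_normal((200, 12))
network_ids = np.repeat(np.arange(3), 4)
tokens = network_tokens(roi_ts, network_ids, n_networks=3)
print(tokens.shape)
```

Averaging independent noise across the regions of a network lowers its variance, which is one simple intuition for why network-level aggregation could be less sensitive to regional noise than working with individual regions.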
Key Points
- Introduces Brain-Semantoks combining semantic tokenizer and self-distillation for robust fMRI time-series representations
- Reduces noise sensitivity and temporal fluctuations by forming network-level tokens, improving representational stability
- Enables strong phenotype and cognition prediction via linear probes; scales with more unlabeled fMRI data
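A linear probe, as mentioned above, fits only a linear model on top of frozen features, so prediction quality reflects the features rather than a powerful downstream model. The sketch below uses closed-form ridge regression in NumPy. The feature matrix is random stand-in data, not Brain-Semantoks output, and the helper names `fit_linear_probe` and `predict_linear_probe` and the `l2` penalty are assumptions made for this example.

```python
import numpy as np

def fit_linear_probe(features, targets, l2=1.0):
    """Fit ridge regression with an unpenalized bias on frozen features."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(targets, dtype=float)
    # Append a column of ones so the last weight acts as the bias term.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    reg = l2 * np.eye(Xb.shape[1])
    reg[-1, -1] = 0.0  # leave the bias unregularized
    # Solve the regularized normal equations.
    return np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ y)

def predict_linear_probe(weights, features):
    """Apply the fitted probe: features @ w + bias."""
    X = np.asarray(features, dtype=float)
    return X @ weights[:-1] + weights[-1]

# Stand-in "frozen" features and a synthetic phenotype score.
rng = np.random.default_rng(0)
feats = rng.standard_normal((100, 8))
score = feats @ rng.standard_normal(8) + 0.1 * rng.standard_normal(100)
w = fit_linear_probe(feats[:80], score[:80], l2=1.0)
held_out_pred = predict_linear_probe(w, feats[80:])
print(held_out_pred.shape)
```

For a categorical phenotype, the same frozen-feature setup would typically use logistic regression in place of ridge regression.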
Scoring Rationale
Novel, practical foundation-model approach for noisy fMRI; limited by single preprint source and domain-specific focus.