Phi-4-reasoning-vision-15B Launches as an Open-Weight Multimodal Reasoning Model

Developers announce Phi-4-reasoning-vision-15B, a 15-billion-parameter open-weight multimodal reasoning model available via Microsoft Foundry, Hugging Face, and GitHub. Trained on about 200 billion multimodal tokens and built around a SigLIP-2 NaFlex vision encoder, it emphasizes an efficient mid-fusion design and excels at math, science reasoning, and GUI understanding. The model aims to improve the accuracy-to-compute trade-off for interactive vision-language tasks.
Scoring Rationale
Strong practical impact: the official open-weight release enables immediate adoption. Novelty is limited beyond efficiency improvements within the existing Phi model family.
