Product Launch | multimodal | vision language | math reasoning | open source
Phi-4-reasoning-vision-15B Launches as an Open-Weight Multimodal Reasoning Model
Relevance Score: 8.3
Microsoft has announced Phi-4-reasoning-vision-15B, a 15-billion-parameter open-weight multimodal reasoning model available via Microsoft Foundry, Hugging Face, and GitHub. The model was trained on about 200 billion multimodal tokens and pairs a SigLIP-2 NaFlex vision encoder with an efficient mid-fusion design. It excels at math, science reasoning, and GUI understanding, and aims to improve the accuracy-to-compute trade-off for interactive vision-language tasks.



