Product Launch · mlx · local inference · ollama · qwen3.5

Ollama Boosts Mac Performance With MLX

March 31, 2026

Relevance Score: 8.1

Ollama released a preview update, Ollama 0.19, on March 31, 2026, that uses Apple's MLX framework to accelerate local AI inference on Macs with Apple silicon. The company reports about 1.6× faster prefill and nearly 2× faster decode, with the largest gains on M5-series chips, and says the update also brings smarter memory management. The preview requires more than 32 GB of unified memory and currently supports Alibaba's Qwen3.5.
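Because prefill and decode speed up by different factors, the end-to-end gain for a given request depends on how its time splits between the two phases. The sketch below is a rough illustration, not a benchmark: the function name `combined_speedup` and the baseline timings are hypothetical, and the default gains simply restate the reported figures (1.6× prefill, with 2.0 standing in as an upper bound for "nearly 2×" decode).

```python
def combined_speedup(prefill_s, decode_s, prefill_gain=1.6, decode_gain=2.0):
    """Estimate the overall speedup when two phases accelerate by different factors.

    prefill_s, decode_s: baseline seconds spent in each phase (hypothetical inputs).
    prefill_gain, decode_gain: per-phase speedups; the defaults restate the
    reported figures, with 2.0 as an upper bound for "nearly 2x" decode.
    """
    before = prefill_s + decode_s
    after = prefill_s / prefill_gain + decode_s / decode_gain
    return before / after


# Hypothetical request: 2 s of prompt processing, 8 s of token generation.
print(round(combined_speedup(2.0, 8.0), 2))
```

A decode-heavy request like this one lands close to the decode figure, while a long-prompt, short-answer request would sit nearer 1.6×.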

Scoring Rationale

Official product preview from Ollama offers measurable performance gains (1.6× prefill, ~2× decode) and is directly actionable for Mac users; scored high for actionability, credibility, and relevance. Scope is limited to Apple-silicon Macs and current model support is narrow, which slightly reduces novelty and breadth.