Product Launch | mistral 4b, voxtral realtime, speech recognition, open source
Open-Source Project Implements Voxtral Realtime Inference in Pure C
Relevance Score: 7.1
A new open-source project provides a pure C implementation of the inference pipeline for Mistral AI's Voxtral Realtime 4B model, with zero external dependencies and support for MPS and BLAS backends. The release includes a streaming C API, a simple Python reference, memory-mapped BF16 weights (~8.9 GB), chunked audio encoding, and a rolling 8192-position KV cache for long transcripts. It enables portable, dependency-free transcription on macOS and Linux.
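
Two of the details in the summary, BF16 weights and a rolling 8192-position KV cache, can be illustrated with a minimal sketch. This is not code from the project: the names `bf16_to_f32`, `kv_slot`, and `KV_WINDOW` are hypothetical, and the ring-buffer indexing is only an assumption about how a rolling cache of that size might be addressed. The BF16 conversion itself relies on a known property of the format: a BF16 value is the upper 16 bits of an IEEE-754 32-bit float.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical constant: window size taken from the "8192-position" figure
 * in the summary; the project's actual constant name is not known. */
#define KV_WINDOW 8192u

/* Widen one BF16 value to float32. BF16 keeps the sign, the 8 exponent bits,
 * and the top 7 mantissa bits of a binary32, so shifting left by 16 and
 * reinterpreting the bits recovers the value exactly. */
static float bf16_to_f32(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f); /* reinterpret bits without aliasing issues */
    return f;
}

/* Assumed ring-buffer addressing: map an absolute token position to a slot
 * in a fixed-size cache, so position KV_WINDOW overwrites slot 0. */
static uint32_t kv_slot(uint64_t pos)
{
    return (uint32_t)(pos % KV_WINDOW);
}
```

Under these assumptions, weights read from a memory-mapped file could be widened on the fly with `bf16_to_f32`, and a transcript longer than 8192 positions would reuse the oldest cache slots rather than grow memory.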



