Case Study
Tags: qwen3.5, jetson thor, edge deployment, agents
BrainiaK Deploys Pipeline On Jetson Thor T5000
Relevance Score: 8.2

In mid-2025, engineers deployed BrainiaK, a full agentic knowledge pipeline, on NVIDIA's Jetson Thor T5000, running a 122B-parameter Qwen3.5 model quantized to 4-bit AWQ entirely within the device's 128 GB of unified LPDDR5X memory. The stack combines vLLM, Docker Compose, composite memory, tool execution, and MathCore, and delivers roughly 13 tokens per second with contexts of up to 32,000 tokens on a single edge device. The deployment demonstrates that a reproducible, on-premise agent pipeline can run without cloud inference.
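
The figures above can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below is illustrative only and is not taken from the BrainiaK deployment: the helper names `awq4_weight_gb` and `generation_seconds` are hypothetical, and the estimate assumes exactly 4 bits per weight, ignoring AWQ group scales and zero-points, the KV cache, activations, and the rest of the stack, all of which use additional memory.

```python
# Rough sizing sketch; helper names are hypothetical, not part of BrainiaK or vLLM.

def awq4_weight_gb(params_billion: float, bits_per_weight: float = 4.0) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Assumes a flat bits_per_weight; real AWQ checkpoints also store
    per-group scales and zero-points, so the true size is somewhat larger.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def generation_seconds(num_tokens: int, tokens_per_second: float = 13.0) -> float:
    """Time to decode num_tokens at a steady decode rate (prefill not included)."""
    return num_tokens / tokens_per_second


UNIFIED_MEMORY_GB = 128   # from the case study
MODEL_PARAMS_B = 122      # from the case study

weights_gb = awq4_weight_gb(MODEL_PARAMS_B)
headroom_gb = UNIFIED_MEMORY_GB - weights_gb

print(f"Approx. weight footprint: {weights_gb:.1f} GB")
print(f"Memory left for KV cache, activations, OS, other services: {headroom_gb:.1f} GB")
print(f"Approx. time for a 1,000-token reply: {generation_seconds(1000):.0f} s")
```

Under these assumptions the weights take about 61 GB, a little under half of the 128 GB unified memory, which is consistent with the model running entirely on the device. At roughly 13 tokens per second, a 1,000-token response takes on the order of 77 seconds to decode.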


