NVIDIA Rolls Out GPT-5.5 to 10K Employees

NVIDIA has given more than 10,000 employees early access to OpenAI's `GPT-5.5` via the agentic coding app Codex, running on NVIDIA `GB200 NVL72` rack-scale systems. Internal reports describe dramatic productivity gains: debugging cycles cut from days to hours, multi-file experiments finishing overnight, and teams shipping end-to-end features from natural-language prompts. NVIDIA cites hardware efficiency improvements, claiming 35x lower cost per million tokens and 50x higher token output per second per megawatt versus prior-generation systems. The rollout uses per-agent dedicated VMs, a zero-data-retention policy, and read-only production access to balance agent capability with enterprise security. The deployment signals tighter OpenAI-NVIDIA integration and makes frontier-model inference more viable at enterprise scale.
What happened
NVIDIA and OpenAI have pushed a company-wide pilot of OpenAI's latest frontier model, `GPT-5.5`, into production inside NVIDIA. Over 10,000 NVIDIANs across engineering, product, legal, marketing, finance, sales, HR, operations, and developer programs have early access to Codex, the agentic coding and automation application powered by `GPT-5.5` and served on `GB200 NVL72` rack-scale systems. NVIDIA reports measurable productivity gains, with employees calling results "mind-blowing" and CEO Jensen Huang urging adoption: "Let's jump to lightspeed. Welcome to the age of AI."
Technical details
The deployment is architected for high-throughput, low-cost inference on NVIDIA hardware and includes several engineering and security choices practitioners should note.
- Scale and performance: NVIDIA claims the `GB200 NVL72` configuration delivers 35x lower cost per million tokens and 50x higher token output per second per megawatt versus prior-generation systems, making frontier-model inference viable for broad internal consumption.
- Runtime stack and optimizations: OpenAI model weights are optimized for NVIDIA inference stacks such as TensorRT-LLM, and NVIDIA supports other runtimes like vLLM and Ollama for ecosystem portability and latency tuning.
- Agent design and integration: Codex acts as an agent layer connecting natural-language prompts to code generation, CI workflows, and internal tools; teams report faster end-to-end feature delivery and fewer wasted experimentation cycles.
- Enterprise security posture: Each agent is provisioned with a dedicated VM, secured SSH access, read-only production permissions for agents, a zero-data-retention policy, and an internal toolkit named Skills to mediate operations and auditing.
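The efficiency claims above reduce to simple arithmetic. As a minimal sketch: only the 35x and 50x multipliers come from NVIDIA's announcement; the baseline dollar and throughput figures below are hypothetical placeholders for illustration, not published data.

```python
# Illustrative arithmetic for NVIDIA's claimed efficiency gains.
# The 35x and 50x factors are NVIDIA's figures; the baseline numbers
# used below are hypothetical placeholders, not published data.

COST_REDUCTION = 35.0   # claimed: 35x lower cost per million tokens
THROUGHPUT_GAIN = 50.0  # claimed: 50x more tokens/sec per megawatt

def projected_cost_per_million(prior_cost_usd: float) -> float:
    """Cost per million tokens after the claimed 35x reduction."""
    return prior_cost_usd / COST_REDUCTION

def projected_tokens_per_sec_per_mw(prior_rate: float) -> float:
    """Token throughput per second per megawatt after the claimed 50x gain."""
    return prior_rate * THROUGHPUT_GAIN

# Hypothetical baseline: $7.00 per million tokens, 2,000 tokens/sec/MW.
print(projected_cost_per_million(7.00))         # 0.2
print(projected_tokens_per_sec_per_mw(2000.0))  # 100000.0
```

The point of the exercise: a 35x cost reduction turns a workload priced in dollars per million tokens into one priced in cents, which is what makes company-wide, always-on agent usage economically plausible.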
Context and significance
This deployment is a practical indicator of two converging trends: frontier models moving from research to daily engineering workflows, and hardware vendors embedding those models into operational stacks. NVIDIA is not only supplying GPUs but also showcasing a full-stack play where model performance claims and cost-per-token economics justify company-wide usage. For OpenAI, running GPT-5.5 on NVIDIA systems reinforces a day-zero partnership narrative that dates back to early DGX deliveries, and it positions Codex as a direct counter to competitors in agentic software engineering. Token efficiency and reduced operational cost are central because they change the unit economics of running large models inside enterprises, and they influence platform choice when engineering teams evaluate tradeoffs between providers.
What to watch
Validate NVIDIA's efficiency claims independently, monitor whether similar deployments appear at other frontier-model vendors, and track changes in internal guardrails, token accounting, and compliance as GPT-5.5 moves from pilot to general availability.
Scoring rationale
This is a notable operational milestone: large-scale, secure internal use of a frontier model on vendor-optimized hardware materially affects engineering productivity and enterprise economics. The story combines product integration and infrastructure efficiency, but it is not a paradigm-shifting public model release.