Microsoft Is Cutting GPT-4 Out of Copilot. Its Own Model Takes Over in August.

By LDS Team, Let's Data Science · 8 min read
At Build 2026 on June 2, Microsoft unveiled Project Polaris, an in-house coding model that replaces GPT-4 Turbo as the default for every GitHub Copilot subscriber starting in August. It runs on Microsoft's own Maia chips, ships alongside multi-agent VS Code, and ends Copilot's dependence on OpenAI.

Satya Nadella opened Microsoft Build 2026 at the Fort Mason Center in San Francisco on Tuesday with a line about where AI is going: from synchronous assistants that wait for your prompt to what he called "async coworkers" that run long tasks on their own. Then he announced the thing that proves Microsoft means it. The model that has powered GitHub Copilot for millions of developers is being replaced. Not by a newer OpenAI model. By Microsoft's own.

The model is called Project Polaris, and starting in August 2026 it becomes the default engine for every Copilot subscriber. GPT-4 Turbo, the OpenAI model that has sat under the most widely used AI coding tool in the world, gets demoted to an optional three-month fallback that developers have to actively configure before the deadline. After that, it is gone from the default path entirely.

For a product built on OpenAI's models since the day it launched, this is the cord finally being cut.

Polaris Is Microsoft's Bid to Own the Whole Stack

Project Polaris is a mixture-of-experts model, an architecture that routes each request to specialized sub-networks rather than running the entire model every time. Microsoft built specialized sub-modules tuned for different programming languages and frameworks. The company says the largest quality gains show up in low-resource languages like Rust and Haskell, the kinds of languages where training data is thin and general-purpose models tend to struggle.
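To make the routing idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a generic illustration of the mixture-of-experts pattern, not Microsoft's design: the experts, the gate scores, and the choice of k=2 are all hypothetical, and nothing public describes how Polaris actually routes requests.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k=2):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Hypothetical experts: toy functions standing in for specialized sub-networks.
experts = [
    lambda x: x + 1.0,
    lambda x: x * 2.0,
    lambda x: x - 3.0,
    lambda x: x * x,
]

# Hypothetical gate scores; a real gate computes these from the input itself.
gate_scores = [0.1, 2.0, -1.0, 2.0]

selected = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)
```

Only two of the four experts run for this input, which is the source of the compute savings the architecture is known for. The other two are skipped entirely.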

It runs on Microsoft's custom Maia AI accelerators inside Azure, the same in-house silicon line the company spent two years trying to sell to an outside customer before Anthropic finally signed on. Running Copilot's default model on its own chips, in its own data centers, gives Microsoft control over the model, the inference hardware, and the developer experience end to end. Microsoft says that combination lowers per-inference latency and cost compared with the GPT-4 backend it is replacing.

Here is how Microsoft frames the swap:

| | GPT-4 Turbo (current) | Project Polaris (August) |
| --- | --- | --- |
| Builder | OpenAI | Microsoft |
| Architecture | Dense transformer | Mixture-of-experts |
| Inference hardware | Mixed | Maia accelerators on Azure |
| Pro-tier context | Standard | Up to 100,000 lines multi-file |
| Benchmarks cited | Baseline | Claims to beat GPT-4 Turbo on HumanEval and MBPP |

One number deserves a hard asterisk. The claim that Polaris outperforms GPT-4 Turbo on HumanEval and MBPP comes entirely from Microsoft. As of the keynote, no independent auditor had verified those figures. Benchmark claims from the company that built the model, about the model replacing a competitor's model, are exactly the claims a working engineer should test before trusting.
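For teams that want to run that test themselves, HumanEval-style results are usually reported as pass@k: the chance that at least one of k sampled completions passes a problem's unit tests. The sketch below implements the standard unbiased estimator for that metric. Microsoft has not said which k or sampling settings sit behind its claim, so the function names and the sample numbers here are illustrative only.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k for one problem.

    n: completions sampled, c: completions that passed the tests,
    k: the budget being scored (1 <= k <= n).
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Too few failures to fill k slots, so any draw of k contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

def mean_pass_at_k(results, k):
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)

# Hypothetical results for three problems: (samples drawn, samples that passed).
example_results = [(10, 3), (10, 0), (10, 10)]
example_score = mean_pass_at_k(example_results, k=1)
```

Running the same harness against both the fallback model and the new default, on problems drawn from your own codebase, says more about your stack than any vendor slide.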

The Strategic Logic Is About OpenAI

This did not come out of nowhere. In April, Microsoft and OpenAI ended the seven-year exclusive partnership that had made Azure the primary cloud for OpenAI and made OpenAI's models Microsoft's default AI brain. The arrangement had grown commercially awkward. OpenAI was building Codex, a direct Copilot competitor that now runs on your machine, while Microsoft was competing with OpenAI for the same enterprise accounts. Two companies sharing a user base while building rival products is a hard thing to sustain.

Project Polaris resolves the tension by removing the dependency. Microsoft now competes with OpenAI on model quality, not just on distribution. Alongside Polaris, the company announced version 2 of its MAI model suite, covering image generation, multilingual voice synthesis, and transcription, part of a broader effort to replace OpenAI-supplied models across Microsoft's products. Pricing for the version 2 models was not disclosed.

Copilot Became a Team of Agents, Not a Single Assistant

The model swap grabbed headlines, but the change that lands fastest for working developers is the new multi-agent mode for VS Code, which became available at Build.

Until now, Copilot routed everything through a single agent. The new architecture introduces an orchestrator that decomposes a task and spawns parallel subagents, each assigned to a discrete workstream. Linting, test generation, documentation, and security review can run simultaneously instead of one after another. The orchestrator surfaces all of it in a unified view, and developers can monitor progress and steer mid-run without losing their place. It extends the /fleet command that already let Copilot CLI dispatch parallel subagents, pulling that pattern into the editor.
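The fan-out pattern described above can be sketched with Python's asyncio. This is a generic orchestrator under stated assumptions, not Copilot's implementation: `run_subagent` is a hypothetical placeholder where a real system would call a model, and the workstream names simply mirror the examples in the paragraph.

```python
import asyncio

async def run_subagent(name, task):
    """Hypothetical subagent: a real one would call a model backend here."""
    await asyncio.sleep(0.01)  # stands in for model latency
    return f"{name} finished: {task}"

async def orchestrate(task, workstreams):
    """Start one subagent per workstream and collect results in a single report."""
    jobs = [run_subagent(name, task) for name in workstreams]
    # return_exceptions=True keeps one failed subagent from cancelling the rest.
    outcomes = await asyncio.gather(*jobs, return_exceptions=True)
    report = {}
    for name, outcome in zip(workstreams, outcomes):
        if isinstance(outcome, Exception):
            report[name] = f"failed: {outcome}"
        else:
            report[name] = outcome
    return report

WORKSTREAMS = ["lint", "tests", "docs", "security-review"]
report = asyncio.run(orchestrate("add retry logic to the HTTP client", WORKSTREAMS))
```

Because the four subagents wait concurrently, the run takes roughly as long as the slowest one rather than the sum of all four. The trade-off is that every subagent makes its own model calls.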

Microsoft shipped a full agentic stack around it:

  • Copilot Workspace reached general availability. GitHub CEO Thomas Dohmke called it "the biggest change to Copilot since launch." It lets Copilot reason across an entire repository, propose multi-file edits, run tests, and iterate on a scoped task.
  • Autonomous Agent Mode for Copilot Enterprise arrives in July 2026, letting the platform write, test, and commit entire feature branches. Every change still needs human approval before merge, and each task runs in an ephemeral Agent Sandbox Linux container so it cannot touch the production repo until a reviewer accepts the pull request.
  • The Microsoft Agent Framework for .NET and Python, which hit a production 1.0 in April, was MIT-licensed at Build and named Microsoft's recommended standard for multi-agent systems on Azure.
  • Azure Agent Mesh, a control plane that routes agent tasks across on-premises Windows servers, Windows 365 Cloud PCs, and Azure Arc edge devices, was announced with general availability targeted for the fourth quarter of 2026.

The Other Side: Costs, Lock-In, and an Expanded Attack Surface

Three concerns sit under the announcements, and none of them showed up on the keynote slides.

The first is cost. The agentic features arrive on top of the usage-based billing model that triggered a developer backlash when one user's Copilot bill jumped to $750. Microsoft's AI Credits metering went live on June 1, the day before Build. Multi-agent mode, by definition, runs more model calls in parallel, and parallel calls cost more. Power users who already watched their bills climb have reason to read the meter carefully before turning on a feature that spawns four agents where there used to be one.
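A back-of-the-envelope model shows why the meter matters. The sketch assumes each parallel agent has to load the shared repository context for itself, while a single agent loads it once. That is an assumption about how such systems typically behave, not a published detail of Copilot's billing, and every number below is hypothetical.

```python
def run_tokens(context_tokens, work_tokens_per_stream, streams, parallel):
    """Estimate tokens consumed by one task split into `streams` workstreams.

    Assumption: a parallel run loads the shared context once per agent,
    while a sequential single-agent run loads it once in total.
    """
    total_work = work_tokens_per_stream * streams
    context_loads = streams if parallel else 1
    return context_tokens * context_loads + total_work

# Hypothetical sizes: 30,000 tokens of shared context, 5,000 tokens per workstream.
single_agent = run_tokens(30_000, 5_000, streams=4, parallel=False)
four_agents = run_tokens(30_000, 5_000, streams=4, parallel=True)
ratio = four_agents / single_agent
```

Under these made-up numbers the parallel run consumes 2.8 times the tokens for the same work. The real multiplier depends on context size and on how credits map to tokens, which is the reason to check your own usage data first.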

The second is lock-in. Owning the model, the chips, and the tooling is great for Microsoft and convenient for developers who stay inside the ecosystem. It is less great if Polaris underperforms on your stack and your only sanctioned escape hatch is a three-month GPT-4 Turbo fallback that you have to configure before the August switch. Teams building on the Copilot SDK should test their real workflows against Polaris before the fallback window closes rather than discovering its quirks after forced migration.

The third is security, and it is the most concrete. Researchers at PromptArmor demonstrated that a crafted command could bypass the Copilot CLI's read-only allowlist and execute an external payload without a confirmation dialog. Separately, researchers including Aonan Guan and a team from Johns Hopkins University showed that GitHub Actions-based AI agents, Copilot Agent among them, are vulnerable to "Comment-and-Control" attacks, where instructions hidden in a pull request title or issue comment cause an agent to leak API keys and access tokens through GitHub's own infrastructure. Multi-agent mode sends far more code context to model backends than basic completions do. More agents acting autonomously means more surface for exactly these attacks.
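One inexpensive layer of defense is to scrub anything token-shaped from agent output before it is posted back to a pull request or issue. The sketch below is a hypothetical filter, not a GitHub feature and not a fix for the attacks described above. It matches only two well-known credential shapes: classic GitHub personal access tokens and AWS access key IDs.

```python
import re

# Two well-known credential shapes. A production scanner covers many more.
SECRET_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),  # classic GitHub personal access token
    re.compile(r"AKIA[0-9A-Z]{16}"),     # AWS access key ID
]

def redact_secrets(text):
    """Replace secret-like substrings and report how many were found."""
    hits = 0
    for pattern in SECRET_PATTERNS:
        text, count = pattern.subn("[REDACTED]", text)
        hits += count
    return text, hits

# Hypothetical agent output containing fake, correctly shaped credentials.
fake_token = "ghp_" + "a" * 36
agent_output = f"Deploy used {fake_token} and AKIAABCDEFGHIJKLMNOP for upload."
clean_output, found = redact_secrets(agent_output)
```

A filter like this catches accidental leaks of known formats. It does nothing about the underlying problem, which is an agent following instructions embedded in untrusted text, so narrow permission scopes still matter more.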

What Developers Should Do Before August

The practical checklist is short. If your team uses Copilot for production tooling or builds on the Copilot SDK, evaluate Polaris now and configure the GPT-4 Turbo fallback before the deadline, because automatic migration takes the choice away in August. If you are on a personal plan, verify your opt-out status under GitHub's April training-data policy, which defaults personal-plan interactions into model training unless you turn it off; Business and Enterprise plans are governed separately. And before enabling multi-agent mode at scale, audit your agent permission scopes and turn on secret scanning, especially if you run Copilot inside GitHub Actions workflows.

The Bottom Line

Microsoft just did to OpenAI inside Copilot what OpenAI has been doing to Microsoft everywhere else: it built a competing product and made it the default. Project Polaris is not only a new model. It is a declaration that Microsoft intends to own the most-used developer tool on the planet from the silicon up, with no outside dependency it cannot control.

For developers, the upside is real. A model tuned for your language, running on cheaper inference, inside an editor that now fields a team of agents instead of one. The catch is that you are being asked to trust unverified benchmarks, accept deeper lock-in, and absorb the cost and security weight of agents that act on their own. The migration is automatic. The scrutiny has to be yours.

In August, when GPT-4 Turbo quietly disappears from the default, most Copilot users will not notice the model under the hood changed at all. The ones who should notice are the teams shipping production code with it. For them, the right move is the one Microsoft did not put on a slide: test it yourself, before the fallback window closes.
