GitHub Copilot moves to token-based billing from June 1

GitHub announced in a company blog post that GitHub Copilot will transition from the current Premium Requests system to token-based usage billing on June 1, 2026 (GitHub Blog). Under the new system, every Copilot plan includes a monthly allotment of GitHub AI Credits, and additional usage is billed by token consumption (input, output, and cached tokens) at the listed API rates for each model (GitHub Blog; GitHub Docs). GitHub says code completions and Next Edit suggestions remain free, while long agentic sessions and large-codebase operations will consume credits faster (GitHub Blog; Ars Technica). GitHub has published a preview billing tool and model pricing reference pages to help customers estimate costs (GitHub Blog; GitHub Docs).
What happened
Per a GitHub company blog post, GitHub Copilot will replace its per-request Premium Request Unit (PRU) system with a token-based usage billing model starting June 1, 2026. The change moves subscribers to a monthly allotment of GitHub AI Credits and charges additional usage according to token consumption, with input, output, and cached tokens priced at the platform's published API rates for each model (GitHub Blog; GitHub Docs). GitHub also said it will provide a preview bill experience in early May so users can view projected costs on their Billing Overview page (GitHub Blog).
Technical details
Per GitHub's documentation and announcement, a single AI Credit is valued at $0.01, and usage converts from token counts into AI Credits using model-specific rates (GitHub Blog; GitHub Docs). A token is roughly three-quarters of a word, and both prompt input and model output count toward consumption; cached tokens are also included in the calculation (GitHub Docs; ghacks.net). GitHub stated that simple editor features such as inline code completions and Next Edit suggestions will remain free, while higher-cost features such as multi-step agentic sessions and repository-scale operations will consume AI Credits at a materially higher rate (GitHub Blog; Ars Technica).
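The token-to-credit arithmetic above can be sketched as follows. The $0.01-per-credit figure comes from GitHub's documentation; the per-token rates below are hypothetical placeholders for illustration, not GitHub's published prices.

```python
# Sketch: converting token usage into AI Credits and dollars.
# CREDIT_USD reflects GitHub's stated $0.01 per AI Credit; the
# per-1M-token rates are HYPOTHETICAL, not actual published prices.

CREDIT_USD = 0.01  # one AI Credit, per GitHub Docs

# Hypothetical per-1M-token rates (USD) for an illustrative model.
RATES_USD_PER_MTOK = {"input": 3.00, "cached": 0.30, "output": 15.00}

def estimate_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> dict:
    """Estimate USD cost and AI Credit consumption for one request."""
    usd = (
        input_tokens / 1e6 * RATES_USD_PER_MTOK["input"]
        + cached_tokens / 1e6 * RATES_USD_PER_MTOK["cached"]
        + output_tokens / 1e6 * RATES_USD_PER_MTOK["output"]
    )
    return {"usd": round(usd, 4), "credits": round(usd / CREDIT_USD, 2)}

# Example: a chat turn with 8k input, 2k cached, and 1k output tokens
print(estimate_cost(8_000, 2_000, 1_000))
```

Under these assumed rates, output tokens dominate the bill, which is why long agentic sessions that generate large amounts of code would burn credits far faster than short completions.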
Editorial analysis - technical context
Industry-pattern observations: Providers shifting from per-request or flat subscriptions to token- or compute-based pricing aim to align charges with backend inference costs. This pattern has appeared across other AI platforms, where model choice, context length, and repeated long-running agentic workflows drive disproportionate compute. For practitioners, it means predictable per-seat subscription economics give way to variable, usage-driven costs tied to model selection and session length.
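A back-of-the-envelope way to reason about that shift is a break-even calculation: the monthly token volume at which metered billing would exceed a flat per-seat price. The plan price and blended rate below are illustrative assumptions, not GitHub's actual figures.

```python
# Break-even sketch: at what monthly token volume does metered billing
# exceed a flat per-seat subscription? Both numbers are ASSUMPTIONS
# for illustration, not GitHub's actual plan prices or rates.

FLAT_SEAT_USD = 39.0   # hypothetical flat monthly plan price
USD_PER_MTOK = 6.0     # hypothetical blended per-1M-token rate

breakeven_mtok = FLAT_SEAT_USD / USD_PER_MTOK
print(f"Break-even at {breakeven_mtok:.1f}M tokens/month")
```

Teams whose workflows routinely cross that threshold are the ones for whom the billing change matters most.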
Context and significance
Editorial analysis: GitHub framed the move as a response to "escalating inference cost" and the emergence of agentic, long-running workflows. Reporting from Ars Technica and The Register highlights the same rationale and notes that GitHub had been subsidizing heavy users under the PRU system (GitHub Blog; Ars Technica; The Register). For development teams and platform owners, this reduces cost cross-subsidization: simple completions remain free while compute-heavy tasks become metered. That change alters cost predictability for teams that have automated large-scale code operations or built agents on Copilot.
What to watch
For practitioners: Monitor three indicators visible in GitHub's interfaces and docs.
1) The early-May preview bill output in Billing Overview, which shows model-level consumption profiles (GitHub Blog).
2) The published per-token API rates in GitHub Docs, which determine how quickly AI Credits are spent for different models (GitHub Docs).
3) Product telemetry on which Copilot features actually consume credits in practice; GitHub has said code completions and Next Edit remain free but warned that repository-scale agentic runs will cost more (GitHub Blog; ghacks.net).
Implications for teams and tooling
Editorial analysis: Organizations running CI/automation, repository analysis, or multi-agent coding workflows should treat Copilot as a metered cloud compute resource going forward. Observers should expect a need to track token volumes per workflow and to incorporate model-choice and context-length controls into cost management. Openly published per-token rates let teams model spend, but the non-deterministic nature of token usage means projected bills can differ from actuals unless teams instrument usage closely.
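The instrumentation described above can be sketched as a simple per-workflow aggregator. The event shape, workflow tags, and rates here are hypothetical; GitHub does not expose this as an API, and teams would populate it from whatever usage telemetry they collect.

```python
# Sketch of lightweight per-workflow usage tracking, assuming each AI
# call reports its token counts. Names and rates are illustrative.

from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Usage:
    workflow: str        # e.g. "ci-review", "interactive-chat"
    input_tokens: int
    output_tokens: int

# Hypothetical per-1M-token rates (USD); CREDIT_USD is GitHub's stated value.
RATES_USD_PER_MTOK = {"input": 3.00, "output": 15.00}
CREDIT_USD = 0.01

def credits_by_workflow(events):
    """Aggregate estimated AI Credit consumption per workflow tag."""
    totals = defaultdict(float)
    for e in events:
        usd = (e.input_tokens * RATES_USD_PER_MTOK["input"]
               + e.output_tokens * RATES_USD_PER_MTOK["output"]) / 1e6
        totals[e.workflow] += usd / CREDIT_USD
    return dict(totals)

events = [
    Usage("interactive-chat", 5_000, 800),
    Usage("ci-review", 400_000, 20_000),  # repository-scale run
]
print(credits_by_workflow(events))
```

Even with made-up rates, the sketch shows the disparity the article describes: a single repository-scale run consumes orders of magnitude more credits than an interactive chat turn, which is exactly the kind of per-workflow visibility teams would need for budgeting.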
Reported quotes and transparency
GitHub's announcement, written by Mario Rodriguez, GitHub's chief product officer, stated: "Today, a quick chat question and a multi-hour autonomous coding session can cost the user the same amount" (GitHub Blog). GitHub has also published Docs pages listing model and pricing references and a usage-based billing guide for individuals (GitHub Docs).
Bottom line
Editorial analysis: The move aligns Copilot with broader industry billing models that charge for inference and context length rather than abstract request counts. Teams that depend on Copilot for short, interactive completions will see little change; teams that run repository-scale or long-running agent tasks should use the preview billing tools and the published per-token rates to anticipate and manage new variable costs.
Scoring Rationale
This is a notable commercial change affecting large numbers of developers and orgs that use Copilot. It materially alters cost predictability for heavy, agentic workflows but leaves basic completions free, making it important for engineering managers and platform teams to reassess tooling spend.
